Data made simple

Dit is a new kind of container file that can hold any data, with scripts and schemas to organize it. Managing data across platforms, contexts, and industries has never been easier.

{
  "sku": {
    "value": "1510_AAM-2000",
    "validator": {
      function validateSKU(value) { 
        return /^\d\d10_[A-Z]{3}-\d{4}$/.test(value);
      }
    }
  }
}
[Figure: DitaBase shown alongside the Schema.org logo, the GitHub logo, and Scripts]

DitaBase

What is DitaBase?

At its core, DitaBase is a developer tool that makes data more useful. It works with any kind of data, and it achieves this by combining several proven ideas.

The DitaBase file type, or dit, is a new kind of container file that holds schemas, scripts, and data together.
Dits can contain anything.

Schema Repository
  • Fully customizable schemas, unlike schema.org.
  • Extend a schema, add fields, alter formats to match your actual data.
  • Any schema can be in any container — JSON, XML, YAML etc.
  • Schemas are directly integrated into dits.
<schema>
  <name>product</name>
  <token>object</token>
  <fields>
    <field>title</field>
    <field>sku</field>
    <field>description</field>
    <field>price</field>
    <field>gtin</field>
  </fields>
</schema>
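To illustrate the claim that a schema can live in any container, here is a minimal sketch that reads the XML schema above and re-emits it as JSON. The helper names (`schema_from_xml`, `missing_fields`) are hypothetical and are not part of DitaBase; only Python's standard library is used.

```python
import json
import xml.etree.ElementTree as ET

SCHEMA_XML = """
<schema>
  <name>product</name>
  <token>object</token>
  <fields>
    <field>title</field>
    <field>sku</field>
    <field>description</field>
    <field>price</field>
    <field>gtin</field>
  </fields>
</schema>
"""

def schema_from_xml(text):
    """Read the XML form of a schema into a plain dict (hypothetical helper)."""
    root = ET.fromstring(text.strip())
    return {
        "name": root.findtext("name"),
        "token": root.findtext("token"),
        "fields": [field.text for field in root.find("fields")],
    }

def missing_fields(schema, record):
    """List the schema fields that a record does not provide."""
    return [name for name in schema["fields"] if name not in record]

schema = schema_from_xml(SCHEMA_XML)
# Same schema, different container: JSON instead of XML.
schema_json = json.dumps(schema, indent=2)
```

The dict in the middle is the container-neutral form; XML and JSON are just two ways of writing it down.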

              
Validation Scripts
  • All items — containers, fields, etc. — have validators.
  • Ensures data is always formatted according to its schema, no matter its origin.
  • Anyone can work with obscure schemas by using the validators as unit tests.
  • JavaScript and Python scripts are supported out of the box; any interpreted language can be added.
"schema": {
  "name": "RAM modules",
  "meta-example": "1 x 8GB",
  "validator": {
    def validate(value):
      # Always return an (ok, message) pair so callers can unpack the result.
      if not isinstance(value, str):
        return (False, "error: '" + str(value) + "' is not a string")

      # Drop the trailing "GB" before looking up the module layout.
      modules = value[0:-2]
      modulesEnum = ["1 x 8", "2 x 4", "1 x 16", "2 x 8", "2 x 16"]

      if modules not in modulesEnum:
        return (False, "error: '" + modules + "' not found in enum")

      return (True, "")
  }
}
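The idea of using validators as unit tests can be sketched as follows. This is a standalone Python version of the RAM modules validator, plus a small table-driven checker. It assumes that validators return an `(ok, message)` pair; the `check_samples` helper, the sample values, and the extra check for the "GB" suffix are assumptions added for illustration.

```python
def validate_ram_modules(value):
    """Return (ok, message) for a RAM module string such as "1 x 8GB"."""
    if not isinstance(value, str):
        return (False, f"error: {value!r} is not a string")
    # Assumption: the unit must be GB (the sample schema only shows GB values).
    if not value.endswith("GB"):
        return (False, f"error: '{value}' does not end in GB")
    modules = value[:-2]
    allowed = ["1 x 8", "2 x 4", "1 x 16", "2 x 8", "2 x 16"]
    if modules not in allowed:
        return (False, f"error: '{modules}' not found in enum")
    return (True, "")

def check_samples(validator, samples):
    """Run a validator over (value, expected_ok) pairs and return the mismatches."""
    failures = []
    for value, expected_ok in samples:
        ok, message = validator(value)
        if ok != expected_ok:
            failures.append((value, message))
    return failures

# Hypothetical sample data: known-good and known-bad values for the schema.
SAMPLES = [
    ("1 x 8GB", True),
    ("2 x 16GB", True),
    ("3 x 8GB", False),
    ("1 x 8MB", False),
    (8, False),
]

failures = check_samples(validate_ram_modules, SAMPLES)
```

Someone who has never seen the schema can run their own data through the same checker and read the error messages to learn what the format expects.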
              
Conversion Scripts
  • Schema-to-schema conversion for similar schemas.
  • Convert containers, imperial to metric units, nuanced date formats, and anything in between.
  • Data can use any format, because it can be easily converted to a different one.
  • Converters can be found on the project page for a schema, or in the dits, as preferred.
converter:
  name: dollarToUSD
  expected-in: moneyDollarSign
  expected-out: moneyUSDNoSign
  code: |
    (value) => {
      return `${value.replace('$', '').replace('.', ',')} USD`;
    }
---
converter:
  name: USDToDollar
  expected-in: moneyUSDNoSign
  expected-out: moneyDollarSign
  code: |
    (value) => {
      return `$${value.replace(' USD', '').replace(',', '.')}`;
    }
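As a rough illustration of how a pair of converters like these could be looked up and applied, here is a Python port of the two functions with a small registry keyed by the `expected-in` and `expected-out` names. The registry and the `convert` function are assumptions for this sketch, not a documented DitaBase API.

```python
def dollar_to_usd(value):
    """Python port of dollarToUSD: drop the dollar sign, use a decimal comma."""
    return value.replace("$", "", 1).replace(".", ",", 1) + " USD"

def usd_to_dollar(value):
    """Python port of USDToDollar: the reverse of dollar_to_usd."""
    return "$" + value.replace(" USD", "", 1).replace(",", ".", 1)

# Hypothetical registry: (expected-in, expected-out) -> converter function.
CONVERTERS = {
    ("moneyDollarSign", "moneyUSDNoSign"): dollar_to_usd,
    ("moneyUSDNoSign", "moneyDollarSign"): usd_to_dollar,
}

def convert(value, source, target):
    """Convert a value between two named formats using the registry."""
    if source == target:
        return value
    try:
        converter = CONVERTERS[(source, target)]
    except KeyError:
        raise LookupError(f"no converter from {source} to {target}") from None
    return converter(value)

converted = convert("$16.99", "moneyDollarSign", "moneyUSDNoSign")
restored = convert(converted, "moneyUSDNoSign", "moneyDollarSign")
```

Because the two converters are inverses of each other, a value that goes out and comes back should be unchanged, which makes the round trip a convenient test.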

              
Online Version Control
  • Open source schemas belong entirely to the community.
  • Anyone can make pull requests to suggest changes to schemas.
  • All public scripts and schemas are easily found online.
  • Git keeps track of changes.

The .dit file type

One File Type to Rule Them All

  • Dits can contain anything.
  • Schemas and scripts allow dits to change their container, styling, everything, on the fly.
  • Each field contains a label which uses scripts and schemas to describe the payload.
  • Payloads can contain other fields, primitive values, binary, or entire other dit files.
  • All file formats can be in a dit: .pdf .jpg .dwg .xlsx .pproj .gcode etc.
  • In the future, there will be no need for file extensions: every file will be a dit.
[Figure: a dit file holds a label and a payload. The payload holds field(s), each with its own label and payload, which can hold further field(s), and so on.]
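The label and payload nesting described above can be sketched with two small Python classes. This is only a guess at the shape of the data: the class names, the `schema` and `scripts` attributes, and the `walk` helper are hypothetical, and the real dit format may differ.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Label:
    """Describes a payload. Both attribute names are assumptions for this sketch."""
    schema: str
    scripts: List[str] = field(default_factory=list)

@dataclass
class DitField:
    """A label plus a payload: a primitive, binary data, or nested fields."""
    label: Label
    payload: Union[str, int, float, bool, bytes, None, List["DitField"]]

def walk(dit_field, depth=0):
    """Yield (depth, schema name) for a field and everything nested inside it."""
    yield depth, dit_field.label.schema
    if isinstance(dit_field.payload, list):
        for child in dit_field.payload:
            yield from walk(child, depth + 1)

# A tiny example: a product holding a primitive field and a binary field.
product = DitField(Label("product"), [
    DitField(Label("sku"), "TRO-5593"),
    DitField(Label("photo"), b"\x89PNG"),
])
outline = list(walk(product))
```

Treating a whole dit as just another field is what lets a payload hold an entire other dit file without a special case.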
See a WIP demo of a dit.

Heard enough? Check out the survey, or contact isaiah@ditabase.io to learn more!

Case Study

How would DitaBase look in practice?


Ecommerce

Let's suppose Jane has a product catalog:

SKU      | Product Name                  | Category      | Price
TRO-5593 | TriCool Aluminum Water Bottle | Water Bottles | $16.99

She needs to upload this to her own Shopify site, Amazon, Walmart, Google Shopping and about 20 other sites. However, each site has a slightly different schema, with tedious changes:

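Jane's field | Site's field | Jane's value                  | Site's value
SKU          | SKU          | TRO-5593                      | TRO-5593
Product Name | Title        | TriCool Aluminum Water Bottle | TriCool Aluminum Water Bottle
Category     | Department   | Water Bottles                 | Sports & Fitness : Accessories : Sports Water Bottles
Price        | Sale Price   | $16.99                        | 16,99 USD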

To deal with this, Jane manually creates and edits a different spreadsheet for every site. It is tedious and time-consuming.

Fortunately, Jane has only 50 products to worry about. Even so, adding new products or expanding to more websites is very difficult, if not impossible.

How can DitaBase make this easier?

On DitaBase, someone may have already made a ProductEcommerceAmazon schema, which would be a child of the ProductEcommerce schema. The schemas would have all the validators and converters necessary to convert between them.

This means Jane can simply make one spreadsheet that complies with ProductEcommerce and then automatically convert the data to every other website. If a website doesn't have a schema, she can easily hire a freelancer to write one for that site.
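Here is a minimal sketch of what that automatic conversion might look like for the water bottle example. The field names and values come from the example above; the `FIELD_MAP` structure, the `DEPARTMENTS` lookup, and the function names are assumptions, not the actual ProductEcommerce or ProductEcommerceAmazon schemas.

```python
def dollar_to_usd(value):
    """'$16.99' -> '16,99 USD', mirroring the dollarToUSD converter."""
    return value.replace("$", "", 1).replace(".", ",", 1) + " USD"

# Hypothetical category lookup; only the one pair from the example is known.
DEPARTMENTS = {
    "Water Bottles": "Sports & Fitness : Accessories : Sports Water Bottles",
}

def category_to_department(value):
    return DEPARTMENTS[value]

# Hypothetical mapping: source field -> (target field, converter or None).
FIELD_MAP = {
    "SKU": ("SKU", None),
    "Product Name": ("Title", None),
    "Category": ("Department", category_to_department),
    "Price": ("Sale Price", dollar_to_usd),
}

def convert_record(record, field_map):
    """Rename each field and run its converter, if it has one."""
    converted = {}
    for source_name, (target_name, converter) in field_map.items():
        value = record[source_name]
        converted[target_name] = converter(value) if converter else value
    return converted

catalog = [{
    "SKU": "TRO-5593",
    "Product Name": "TriCool Aluminum Water Bottle",
    "Category": "Water Bottles",
    "Price": "$16.99",
}]
site_rows = [convert_record(row, FIELD_MAP) for row in catalog]
```

With one mapping per website, adding a new product means adding one row to the catalog, and adding a new website means adding one mapping.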


Cross Industry

Now let's meet Tom. Tom is laying out a construction project up in the mountains. He contracted with a land surveying company, which produced data that looks like this:

GPS Point | North     | East      | Elevation | Datum
t-125     | 24504.145 | 17948.076 | 3543.846  | NAD83
t-126     | 24508.491 | 17950.059 | 3543.347  | NAD83

Tom also contracted with a satellite imaging service, which produced a series of photographs so that Tom would know the number and location of trees and other land features. The photographs came with metadata that gives estimated coordinates and the number of trees:

{
  "images": [
    {
      "file": "19-05-17-10:30:45:231.png",
      "coordinate": "41°29'14.1\"N 76°50'14.9\"W",
      "trees": [
        "41°29'12.5\"N 76°50'10.6\"W",
        "41°29'12.1\"N 76°50'10.6\"W",
        "41°29'11.8\"N 76°50'10.7\"W",
        "41°29'11.1\"N 76°50'10.4\"W"
      ]
    }
  ]
}

It would save Tom a lot of time and money to work with a single data set instead of handling each one separately. Unfortunately, the two data sets don't look alike at all, even though they come from theoretically similar industries.

This also isn't the first time Tom has needed two dissimilar survey data sets to work together. The last time, Tom couldn't find another solution and hired a programmer to write custom scripts. But that was a one-time solution and won't work here. What a waste of money!

With DitaBase...

If both companies, or even just one of them, had their data extend the Coordinate schema, then Tom could convert the data that far and do the rest manually. Even though it's an odd, one-time situation, DitaBase still helps Tom enormously.
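As one example of "converting that far", a converter attached to a Coordinate schema might turn the degrees, minutes, and seconds strings from the image metadata into decimal degrees. The sketch below is an assumption about what such a converter could look like; the function name and pattern are hypothetical. It does not attempt to convert the surveyor's North/East values, because that needs projection details the example does not give.

```python
import re

# Matches one half of a coordinate, e.g. 41°29'14.1"N
DMS_PATTERN = re.compile(r"(\d+)°(\d+)'(\d+(?:\.\d+)?)\"([NSEW])")

def dms_to_decimal(text):
    """Convert a string like 41°29'14.1"N 76°50'14.9"W to (lat, lon) in decimal degrees."""
    parts = DMS_PATTERN.findall(text)
    if len(parts) != 2:
        raise ValueError(f"expected a latitude and a longitude in {text!r}")
    values = {}
    for degrees, minutes, seconds, hemisphere in parts:
        decimal = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
        # South and west are negative by convention.
        if hemisphere in "SW":
            decimal = -decimal
        values["lat" if hemisphere in "NS" else "lon"] = decimal
    if set(values) != {"lat", "lon"}:
        raise ValueError(f"expected one N/S and one E/W value in {text!r}")
    return values["lat"], values["lon"]

image = {
    "coordinate": "41°29'14.1\"N 76°50'14.9\"W",
    "trees": [
        "41°29'12.5\"N 76°50'10.6\"W",
        "41°29'12.1\"N 76°50'10.6\"W",
    ],
}
image_position = dms_to_decimal(image["coordinate"])
tree_positions = [dms_to_decimal(tree) for tree in image["trees"]]
```

Once both data sets are expressed in one coordinate form, the remaining manual work shrinks to lining up the two sources.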


Inter-Company

Sara is the CTO of a large medical company, which just acquired a small medical startup. Sara's company has an old codebase, one component of which is a MySQL database. The startup has an amazing API and codebase, which includes MongoDB.

Sara wants to move over to the new codebase, but the migration must be handled delicately. Ideally, the MySQL and MongoDB databases could both stay live in the interim, without developing a costly and unnecessary API for the old codebase.

DitaBase again!

Sara knows that all their customer data will be carried over using the old schema. This means that they can write a child of the DitaBase PersonMedical schema, with one version for MongoDB and one for MySQL.

DitaBase makes it very easy to abstract the container from the data itself. Add a two-way converter and the databases can be used as though they were a single system.

PersonID  | Name     | Age | Blood Type
GUL-89323 | John Doe | 36  | O-
MAL-34106 | Jane Doe | 67  | AB+

PersonID  | ConditionID
MAL-34106 | 1342
MAL-34106 | 1305

ConditionID | Condition
1342        | Arthritis
1305        | High Blood Pressure

{
  "people": [
    {
      "_id": "GUL-89323",
      "name": "John Doe",
      "age": 36,
      "blood-type": "O-"
    },
    {
      "_id": "MAL-34106",
      "name": "Jane Doe",
      "age": 67,
      "blood-type": "AB+",
      "conditions": [
        "Arthritis",
        "High Blood Pressure"
      ]
    }
  ]
}
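A two-way converter between the relational rows and the document form above could look roughly like this. It works on plain Python data rather than on live MySQL or MongoDB connections, and the function names are hypothetical; it is meant only to show that the two shapes carry the same information.

```python
people_rows = [
    {"PersonID": "GUL-89323", "Name": "John Doe", "Age": 36, "Blood Type": "O-"},
    {"PersonID": "MAL-34106", "Name": "Jane Doe", "Age": 67, "Blood Type": "AB+"},
]
person_conditions = [("MAL-34106", 1342), ("MAL-34106", 1305)]
conditions = {1342: "Arthritis", 1305: "High Blood Pressure"}

def rows_to_documents(people_rows, person_conditions, conditions):
    """Fold the three relational tables into one document per person."""
    documents = []
    for row in people_rows:
        document = {
            "_id": row["PersonID"],
            "name": row["Name"],
            "age": row["Age"],
            "blood-type": row["Blood Type"],
        }
        names = [conditions[cid] for pid, cid in person_conditions
                 if pid == row["PersonID"]]
        if names:  # people without conditions get no "conditions" key
            document["conditions"] = names
        documents.append(document)
    return documents

def documents_to_rows(documents, conditions):
    """Split documents back into person rows and person-condition links."""
    ids_by_name = {name: cid for cid, name in conditions.items()}
    rows, links = [], []
    for document in documents:
        rows.append({
            "PersonID": document["_id"],
            "Name": document["name"],
            "Age": document["age"],
            "Blood Type": document["blood-type"],
        })
        for name in document.get("conditions", []):
            links.append((document["_id"], ids_by_name[name]))
    return rows, links

documents = rows_to_documents(people_rows, person_conditions, conditions)
rows_back, links_back = documents_to_rows(documents, conditions)
```

If the round trip returns the original rows, either database can act as the source of truth while the other is kept in step.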

Survey

Please take a moment to answer a few questions!

Or contact isaiah@ditabase.io to learn more.