Case Study

How does DitaBase look in practice?


Let's suppose Jane has a product catalog:

SKU       Product Name                   Category       Price
TRO-5593  TriCool Aluminum Water Bottle  Water Bottles  $16.99

She needs to upload this to her own Shopify site, Amazon, Walmart, Google Shopping and about 20 other sites. However, each site has a slightly different schema, with tedious changes:

Field (catalog → site)    Value (catalog → site)
SKU                       TRO-5593
Product Name → Title      TriCool Aluminum Water Bottle
Category → Department     Water Bottles → Sports & Fitness : Accessories : Sports Water Bottles
Price → Sale Price        16,99 USD

To deal with this, Jane manually creates and edits a separate spreadsheet for every site, which is tedious and time-consuming. She has no way of knowing whether a sheet contains mistakes until she tries to upload it and the site rejects it.

Fortunately, Jane has only 50 products to worry about. Even so, adding new products or expanding to more websites is very difficult, if not impossible.

How can DitaBase make this easier?

Using dit, a developer could create a ProductAmazon object. The product data would be stored in the .dit as JSON, XLSX, or any other format, which makes it easy to work with a variety of tools and data. When published from the dit, it would exactly resemble a tab-delimited CSV file with all of the fields Amazon requires. As long as ProductAmazon inherits from Product and FormatTabCSV, a lot of the work is already done. The developer just needs to write validators for each custom field Amazon uses, like BrandName, Department, and so on.
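As a rough illustration of the inheritance described above, here is a sketch in Python rather than in dit syntax, which this page does not show. The class names Product, FormatTabCSV, and ProductAmazon come from the text. Every method, the field list, and the validation rules are assumptions made for the example, not Amazon's real requirements.

```python
class Product:
    """Shared product fields; validators here are reused by every child."""
    fields = ["SKU", "Title", "Price"]  # assumed field list

    def __init__(self, **values):
        self.values = values

    def validate(self):
        """Return a list of error messages (empty when the product is valid)."""
        errors = []
        if not self.values.get("SKU"):
            errors.append("SKU is required")
        try:
            if float(self.values.get("Price", "")) < 0:
                errors.append("Price must not be negative")
        except (TypeError, ValueError):
            errors.append("Price must be a number")
        return errors


class FormatTabCSV:
    """Publishes any object exposing `fields` and `values` as tab-delimited text."""

    def publish(self):
        header = "\t".join(self.fields)
        row = "\t".join(str(self.values.get(f, "")) for f in self.fields)
        return header + "\n" + row


class ProductAmazon(Product, FormatTabCSV):
    """Only the site-specific fields need new validation code."""
    fields = Product.fields + ["BrandName", "Department"]

    def validate(self):
        errors = super().validate()  # inherited checks come for free
        for name in ("BrandName", "Department"):
            if not self.values.get(name):
                errors.append(f"{name} is required")
        return errors


item = ProductAmazon(
    SKU="TRO-5593",
    Title="TriCool Aluminum Water Bottle",
    Price="16.99",
    Department="Sports Water Bottles",
)
print(item.validate())  # ['BrandName is required']
```

The point of the sketch is the split of responsibilities: Product and FormatTabCSV can be written once and shared, while ProductAmazon adds only the checks that are specific to one site.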

This means Jane can maintain a single spreadsheet that extends Product and the format of whatever spreadsheet program she prefers. That program could have a DitaBase plugin that validates the fields in real time, so if there are errors she knows immediately and can correct them much more quickly. Then, with a single click, the spreadsheet can be converted to ProductAmazon, ProductShopify, etc. The process can still be tedious, but it will now take hours, not weeks.
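The one-click conversion can be pictured as a function from one catalog row to one site-specific row, which reports problems before anything is uploaded. The Python sketch below is only an assumption about what such a converter might do: the function name, the category lookup table, and the price formatting rule are hypothetical, and the sample values are taken from the tables above.

```python
# Hypothetical mapping from Jane's categories to one site's department paths.
DEPARTMENTS = {
    "Water Bottles": "Sports & Fitness : Accessories : Sports Water Bottles",
}


def to_site_row(row):
    """Convert one catalog row to the example site schema, or raise ValueError."""
    errors = []
    category = row.get("Category", "")
    if category not in DEPARTMENTS:
        errors.append(f"unknown category: {category!r}")
    price = row.get("Price", "")
    if not price.startswith("$"):
        errors.append(f"price must start with '$': {price!r}")
    if errors:
        # Jane sees every problem at once, before any upload is attempted.
        raise ValueError("; ".join(errors))
    return {
        "SKU": row["SKU"],
        "Title": row["Product Name"],
        "Department": DEPARTMENTS[category],
        # "$16.99" becomes "16,99 USD": decimal comma plus a currency code.
        "Sale Price": price[1:].replace(".", ",") + " USD",
    }


catalog_row = {
    "SKU": "TRO-5593",
    "Product Name": "TriCool Aluminum Water Bottle",
    "Category": "Water Bottles",
    "Price": "$16.99",
}
print(to_site_row(catalog_row)["Sale Price"])  # 16,99 USD
```

Each additional site would need its own converter of this kind, but the catalog row itself never changes.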

  • Breaking large formats into components and sharing them allows an enormous amount of validation code to be written by people who never communicate. Writing a full-size validator for ProductAmazon alone would be a huge undertaking, and almost certainly not worth the effort. But with shared, open source work, it becomes achievable, and even cheaper in the long run.

  • DitaBase allows anyone to create unintended interoperability between unrelated systems without informing anyone or asking for permission. Amazon never has to lift a finger to allow the creation of the ProductAmazon object, and couldn't stop it even if it wanted to.


Sara is the CTO of a large medical company, which just acquired a small medical startup. Sara's company has an old codebase, one component of which is a MySQL database. The startup has an amazing API and codebase, which includes MongoDB.

Sara wants to move over to the new codebase, but this must be handled delicately. Ideally, MySQL and MongoDB could both stay live in the interim, without developing an otherwise unnecessary API for the old codebase. To further complicate things, the old codebase uses a variety of outdated schemas, full of useless fields and missing important new ones.

With DitaBase...

Sara can start by making a custom PersonLegacy object that matches the old schema. This should inherit from Person and FormatMySQL. Then she can make a brand new PersonCompany object with FormatJSON. Finally, she can write converters to go between the two. Any part of the codebase can then access new customers as though they were in the old system, and old customers as though they were in the new one. However long the transition takes, there is no rush caused by data downtime.

PersonID   Name      Age  Blood Type
GUL-89323  John Doe  36   O-
MAL-34106  Jane Doe  67   AB+

PersonID   ConditionID
MAL-34106  1342
MAL-34106  1305

ConditionID  Condition
1342         Arthritis
1305         High Blood Pressure

{
    "people": [
        {
            "_id": "GUL-89323",
            "name": "John Doe",
            "age": 36,
            "blood-type": "O-"
        },
        {
            "_id": "MAL-34106",
            "name": "Jane Doe",
            "age": 67,
            "blood-type": "AB+",
            "conditions": [
                "High Blood Pressure"
            ]
        }
    ]
}

  • Abstracting data from a format is very easy with dit objects, even if the formats are very different.
  • Dit allows for very easy versioning. Each version is just another object that inherits from some common parent. Mix and match inheritance styles for endless customization.
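The converters between PersonLegacy and PersonCompany could look roughly like the following Python sketch. It works on plain lists and dictionaries instead of live MySQL or MongoDB connections, and the function names and in-memory table layout are assumptions made for illustration. The sample rows are taken from the legacy tables above.

```python
# Legacy (relational) data, one list or dict per table.
people = [
    {"PersonID": "GUL-89323", "Name": "John Doe", "Age": 36, "Blood Type": "O-"},
    {"PersonID": "MAL-34106", "Name": "Jane Doe", "Age": 67, "Blood Type": "AB+"},
]
person_conditions = [
    {"PersonID": "MAL-34106", "ConditionID": 1342},
    {"PersonID": "MAL-34106", "ConditionID": 1305},
]
conditions = {1342: "Arthritis", 1305: "High Blood Pressure"}


def legacy_to_company(person, person_conditions, conditions):
    """Fold one legacy person row and its joined conditions into one document."""
    doc = {
        "_id": person["PersonID"],
        "name": person["Name"],
        "age": person["Age"],
        "blood-type": person["Blood Type"],
    }
    names = [
        conditions[link["ConditionID"]]
        for link in person_conditions
        if link["PersonID"] == person["PersonID"]
    ]
    if names:  # people without conditions get no "conditions" key
        doc["conditions"] = names
    return doc


def company_to_legacy(doc):
    """Recover the legacy person row from a document (conditions handled separately)."""
    return {
        "PersonID": doc["_id"],
        "Name": doc["name"],
        "Age": doc["age"],
        "Blood Type": doc["blood-type"],
    }


docs = [legacy_to_company(p, person_conditions, conditions) for p in people]
```

Because each direction is a small pure function, old and new code can keep reading whichever shape they expect while the migration is still in progress.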

Cross Industry

Now let's meet Tom. Tom is laying out a construction project up in the mountains. He contracted with a land surveying company, which produced topographic data that looks like this:

GPS Point North East Elevation Datum
t-125 24504.145 17948.076 3543.846 NAD83
t-126 24508.491 17950.059 3543.347 NAD83

Tom also contracted with a satellite imaging service, which produced a series of photographs so that Tom would know the number and location of trees and other land features. The photographs came with metadata that gives estimates of the coordinates and number of trees:

{
  "images": [
    {
      "file": "19-05-17-10:30:45:231.png",
      "coordinate": "41°29'14.1\"N 76°50'14.9\"W",
      "trees": [
        "41°29'12.5\"N 76°50'10.6\"W",
        "41°29'12.1\"N 76°50'10.6\"W",
        "41°29'11.8\"N 76°50'10.7\"W",
        "41°29'11.1\"N 76°50'10.4\"W"
      ]
    }
  ]
}

Tom needs to know where to place buildings, roads, power lines, sewer, and more. The more trees that come down, the greater the costs, both in cutting services and government conservation taxes. It would save Tom a lot of time and money to be able to work with a single dataset instead of each one separately.

Unfortunately, the two datasets don't look alike at all, even though they come from theoretically similar industries. Even minor differences in the data mean extra work for Tom, and these datasets aren't even close to similar.

This also isn't the first time Tom has needed two dissimilar survey datasets to work together. The last time, Tom couldn't find another solution and hired a programmer to write custom scripts. But that was a one-time solution and won't work here. What a waste of money!

DitaBase can help

Ideally, one or both of the companies would offer their data in a dit format. Instead of a plain string with latitude and longitude, Tom would be greeted by a child of the Coordinate object. This way, Tom can use free tools on the DitaBase website to convert both datasets to some compatible format, and do the rest manually. Even though it's an odd, one-time situation, DitaBase still helps Tom enormously.
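To make the Coordinate idea concrete, here is a small Python sketch of an object that parses the degrees, minutes, and seconds strings from the imaging metadata into decimal degrees. The class layout and the from_dms name are assumptions, not part of any published dit library. Converting the surveyor's North and East grid values is not attempted, because that would need projection details the sample data does not give.

```python
import re
from dataclasses import dataclass

# Matches one component such as 41°29'14.1"N.
_DMS = re.compile(r"""(\d+)°(\d+)'(\d+(?:\.\d+)?)"([NSEW])""")


def _to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert one degrees/minutes/seconds component to signed decimal degrees."""
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    return -value if hemisphere in "SW" else value


@dataclass(frozen=True)
class Coordinate:
    latitude: float   # decimal degrees, north positive
    longitude: float  # decimal degrees, east positive

    @classmethod
    def from_dms(cls, text):
        """Parse a string like 41°29'14.1"N 76°50'14.9"W."""
        parts = _DMS.findall(text)
        if len(parts) != 2 or parts[0][3] not in "NS" or parts[1][3] not in "EW":
            raise ValueError(f"expected latitude then longitude, got {text!r}")
        return cls(_to_decimal(*parts[0]), _to_decimal(*parts[1]))


tree = Coordinate.from_dms("41°29'12.5\"N 76°50'10.6\"W")
print(round(tree.latitude, 5), round(tree.longitude, 5))
```

Once every dataset exposes its points as the same kind of object, comparing a tree location with a survey point becomes ordinary arithmetic instead of string handling.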


As with most technologies, the more developers use DitaBase, the better it gets. But unique to DitaBase, this also applies to data. The more data is stored in dit objects, the more likely it is that any problem found in the data already has a solution. It also becomes less likely to encounter errors in general.

This means that people like Tom, who don't know anything about validation or formats, still care very much whether data is in the dit ecosystem. That makes dit data much more valuable.