Back to the Couchbase homepageCouchbase logo
Couchbase Developer

  • Docs

    • Integrations
    • SDKs
    • Mobile SDKs

    • AI Developer
    • Backend
    • Full-stack
    • Mobile
    • Ops / DBA

    • Data Modeling
    • Scalability

  • Tutorials

    • Developer Community
    • Ambassador Program
  • Sign In
  • Try Free

Standardization, Optimization, and Resources

  • Learn about standardization and optimization of your data model
  • Explore some resources to dive deeper into document optimization
  • See examples of standardized fields and why they are recommended

Wrapping up our Data Modeling Guide, we talk about standardization and optimization of your documents and modeling and share some resources to help you dive deeper.

Standardized Fields

docType

We have discussed about docType in the earlier section above. Please refer Document Key Prefixing section in this document for details on docType.

Delimiter

Every Document ID can be a combination of two or more parts/values, that should be delimited by a character such as a colon or an underscore. Pick a delimiter, and be consistent throughout your enterprise.

It is a best practice to only use a single-byte delimiter since this can make a significant difference based on the volume of data.

Schema

Applications are typically versioned using Semantic Versioning, i.e. 2.5.1. Where:

  • 2 is the major version
  • 5 is the minor version
  • 1 is bugfix/maintenance version

Versioning the application informs users of features, functionality, updates, etc. The term "schemaless", is often associated with NoSQL, while this is technically correct, it is better stated as:

Note: "There is no schema managed by the database, however, there is still a schema, and it is an "Application Enforced Schema." The application is now responsible for enforcing the schema as well as maintaining the integrity of the data and relationships".

As schemas change and evolve, documenting the version of the schema provides a mechanism of notifying applications about the schema version of the document that they're working with. This also enables a migration path for updating models which is discussed further in the Schema Versioning section.

{
  "_type": "user",
  "_schema": "1.2",
  "userId": 123
}

Please refer to Document Management Strategies document for a more thorough discussion of schema versioning.

Namespacing

The use of a leading _ creates a standardized approach to global attributes across all documents within the enterprise.

{
  "_type": "user",
  "_schema": "1.2",
  "_created": 1544734688923
  "userId": 123
}

The same can be applied through a top-level property i.e. "meta": {}.

{
  "meta": {
    "type": "user",
    "schema": "1.2",
    "created": 1544734688923
  },
  "userId": 123
}

Choose an approach that works within your organization and be consistent throughout your applications.

Optimizations

JSON gives us a flexible schema, that allows our models to rapidly adapt to change, this is because the schema is explicitly stored alongside each value. Whereas, in an RDBMS the schema is defined by the table columns, which are defined once. In any database, every byte of stored data adds up, historically this has been abstracted from developers as the schema and the database are managed by a DBA. With an application enforced schema, the model size is now controlled by the application. As developers we tend to be overly verbose when describing variables throughout our applications, this practice tends to carry over to our JSON models. While it is generally preferred to maintain human-readable field names for developer productivity, there are often well-understood abbreviations for many fields that will not reduce document readability.

As a general approach, consider the following options to proactively reduce document sizes:

  • Don't store the document ID as a repeated value in the document.
  • Convert ISO-8601 timestamps to epoch time in milliseconds, saving at least 11 bytes. When millisecond precision is not required, convert to a smaller value (i.e. divide by 1000 to convert to seconds, 60 for minutes, 60 for hours, 24 for days), saving at least 4 bytes.
  • Store dates as an ISO format YYYY-MM-DD instead of MMM DD, YYYY.
  • When using GUID's strip all dashes saving an additional 4 bytes per GUID.
  • Use shorter property names.
  • Don't store properties whose value is null , empty String/Array/Object, or a known default.
  • Don't repeat values in arrays whose value is not unique, use a top-level property on the document.

Storing Dates

It is very common in almost any application, there is a need to store a date. This could be when the document was created, modified, when an order was placed, etc. Generally, this date is stored in ISO-8601 format.

Take the date 2018-12-14T03:45:24.478Z as an example, this is very readable, but is it the most efficient way to store the date? Storing this same date as Unix Epoch Time we can represent this same date as 1544759124478. ISO-8601 is 24 bytes, where epoch format is 13 bytes, this saves 11 bytes. This might not seem like a lot, but consider this scenario: 500,000,000 documents and each document has an average of 2 date properties. If we used epoch format, we'd save 11,000,000,000 bytes or 11Gb of space.

Now, take this a step further and ask the question, "What level of precision does the application require?". Often times we do not need millisecond precision, we can divide the epoch date accordingly for seconds, minutes, hours, etc. This applies if dates are being stored in Epoch format.

Epoch Date Precision Reduction Output Length / Bytes
1544759124478 milliseconds n/a 1544759124478 13
1544759124478 seconds / 1000 1544759124 10
1544759124478 minutes /1000 / 60 25745985 8
1544759124478 hours /1000 / 60 / 60 429099 6
1544759124478 days /1000 / 60 / 60 / 24 17879 5

Please refer Document Management Strategies guide for a more in-depth discussion of this topic.

Resources

  • JSON Data Modeling for RDBMS Users
  • Data Modeling for Couchbase with erwin DM NoSQL
  • SQL to JSON Data Modeling with Hackolade
  • Moving from SQL Server to Couchbase - Data Modeling

This tutorial is part of a Couchbase Learning Path:
PREV
Document Key Design
HOME
JSON Data Modeling Guide
Contents
Couchbase home page link

3250 Olcott Street
Santa Clara, CA 95054
United States

  • company
  • about
  • leadership
  • news & press
  • investor relations
  • careers
  • events
  • legal
  • contact us
  • support
  • Developer portal
  • Documentation
  • Forums
  • PROFESSIONAL SERVICES
  • support login
  • support policy
  • training
  • quicklinks
  • blog
  • downloads
  • get started
  • resources
  • why nosql
  • pricing
  • follow us
  • Social Media Link for FacebookFacebook
  • Social Media Link for TwitterTwitter
  • Social Media Link for LinkedInLinkedIn
  • Social Media Link for Youtubeyoutube
  • Social Media Link for GitHubGithub
  • Social Media Link for Stack OverflowStack Overflow
  • Social Media Link for Discorddiscord

© 2025 Couchbase, Inc. Couchbase and the Couchbase logo are registered trademarks of Couchbase, Inc. All third party trademarks (including logos and icons) referenced by Couchbase, Inc. remain the property of their respective owners.

Terms of UsePrivacy PolicyCookie PolicySupport PolicyDo Not Sell My Personal InformationMarketing Preference Center