Study Bible Notes Interchange Format


Document Status: Draft Proposal

The goal of this document is to describe a possible interchange format for study Bible notes. A consistent format for interchange would make it easier for various creators to share study Bible notes with one another and allow the various software systems to import and export notes without losing data in the process.

In an effort to make it easy to read and write these files, we are suggesting a TSV format with a set list of columns.

TSV Format Overview

A Tab Separated Value (TSV) file is like a Comma Separated Value (CSV) file, except that a tab character rather than a comma separates the values. This makes it easier to include prose text in the files, because prose in many languages uses commas, single quotes, and double quotes.

The study notes should be structured as one TSV-encoded file per book of the Bible (for example, 01-GEN.tsv). The columns are proposed to be Book, Chapter, Verse, ID, Tags, Note, OrigQuote, Occurrence, and GLQuote.
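
As a rough illustration of how little code a reader for this layout needs, here is a minimal Python sketch. It assumes each file begins with a header row naming the nine columns, which the proposal does not yet specify, and the function name read_notes and the sample row are made up for the example.

```python
import csv
import io

# Proposed column order. A header row is an assumption of this sketch.
COLUMNS = ["Book", "Chapter", "Verse", "ID", "Tags", "Note",
           "OrigQuote", "Occurrence", "GLQuote"]

def read_notes(stream):
    """Yield one dict per note row from a tab-separated text stream."""
    # QUOTE_NONE treats quotation marks in prose as ordinary characters.
    reader = csv.DictReader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    if reader.fieldnames != COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    for row in reader:
        # Tags is a comma-separated list; an empty cell means no tags.
        row["Tags"] = [t for t in (row["Tags"] or "").split(",") if t]
        yield row

# Hypothetical sample content standing in for a real book file.
sample = (
    "\t".join(COLUMNS) + "\n"
    + "\t".join(["TIT", "01", "03", "k3cb", "NT,Epistle",
                 'Paul says God "revealed" his word, at the right time.',
                 "", "", "he revealed his word"]) + "\n"
)
notes = list(read_notes(io.StringIO(sample)))
```

Commas and double quotes inside the Note column come through untouched, which is the main reason for preferring tabs over commas here.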

tN TSV Column Description

The following lists each proposed column with a brief description and example.

  • Book - USFM book code name (e.g. TIT)
  • Chapter - Chapter number (e.g. 01)
  • Verse - Verse number (e.g. 03)
  • ID - Four character alphanumeric string unique within the verse for the resource (e.g. k3cb)
    • This will be helpful in identifying which notes are translations and which notes have been added by other parties.
    • This would also be a useful way to unambiguously refer to specific notes. An RC link could resolve to a specific note like this: rc://en/ubn/help/tit/01/01/a8n4.
    • Note that these are not necessarily universally unique among everyone’s notes, but they should be unique among a given group’s notes, for that Book, Chapter, Verse combination (1,679,616 possibilities per verse).
  • Tags - Comma separated list of tags that apply to the note (e.g. OT,Narrative,Reformed,GloryofGod)
    • There is no standard for what these tags should be, and they will likely span multiple categories: some tags may refer to a theological tradition (Reformed, Lutheran, Baptist), while others may refer to the topic of the note itself (Salvation, Humility, Sacrifice) or anything else the notes' creators choose.
    • It is suggested that creators choose tags that are meaningful in a broader context since other software and users ingesting the tags will need to figure out the categories.
  • Note - The Markdown formatted note itself
    • The text should be Markdown formatted, which means the following are also acceptable:
      • Plaintext - if you have no need for extra markup, just use plain text in this column
      • HTML - if you prefer to use inline HTML for markup, that works because it is fully supported in Markdown
  • OrigQuote (OPTIONAL) - Original language quote (e.g. ἐφανέρωσεν ... τὸν λόγον αὐτοῦ)
    • An ellipsis (…) indicates that the quote is discontinuous; software should interpret this in a non-greedy manner, matching the shortest possible span of intervening text.
    • If provided, this will allow software to highlight words or phrases in the original language and in an aligned translation for the user to see.
    • This field may include Strong’s numbers instead of original language text, but highlighting will be limited in that case.
  • Occurrence (OPTIONAL, but required if OrigQuote is provided) - Specifies which occurrence in the original language text the entry applies to.
    • -1: entry applies to every occurrence of OrigQuote in the verse
    • 1: entry applies to first occurrence of OrigQuote only
    • 2: entry applies to second occurrence of OrigQuote only
    • etc.
  • GLQuote (OPTIONAL) - Bible translation quote (e.g. he revealed his word)
    • This field is intended as a reference text for translators
    • It may be displayed to a user of the notes; however, it may not exactly match the text of the Bible translation that is presented to the user.
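
To make the OrigQuote and Occurrence rules concrete, here is a minimal Python sketch of how software might locate a quote in a verse. It assumes the ellipsis may be written either as three periods or as the single character …, that Occurrence has already been converted to an integer, and that the verse text and the quote use the same Unicode normalization. The function names and the abridged verse text are illustrative only.

```python
import re

def quote_pattern(orig_quote):
    """Compile OrigQuote into a regex; each ellipsis becomes a non-greedy gap."""
    parts = [p.strip() for p in re.split(r"\.\.\.|…", orig_quote)]
    parts = [p for p in parts if p]
    if not parts:
        return None
    # One capture group per continuous piece, so each can be highlighted.
    return re.compile(".*?".join(f"({re.escape(p)})" for p in parts))

def find_quote(verse_text, orig_quote, occurrence):
    """Return a list of matches; each match is a list of (start, end) spans.

    occurrence == -1 selects every match, n >= 1 selects only the nth.
    """
    pattern = quote_pattern(orig_quote)
    if pattern is None:
        return []
    matches = [[m.span(i) for i in range(1, pattern.groups + 1)]
               for m in pattern.finditer(verse_text)]
    if occurrence == -1:
        return matches
    if 1 <= occurrence <= len(matches):
        return [matches[occurrence - 1]]
    return []

# Abridged verse text, typed by hand for illustration.
verse = "ἐφανέρωσεν δὲ καιροῖς ἰδίοις τὸν λόγον αὐτοῦ ἐν κηρύγματι"
quote = "ἐφανέρωσεν ... τὸν λόγον αὐτοῦ"
highlighted = [[verse[s:e] for s, e in match]
               for match in find_quote(verse, quote, 1)]
```

Returning one span per continuous piece lets a viewer highlight only the quoted words and skip the text that the ellipsis stands for.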

The notes could be packaged in a Resource Container bundle. This would require an additional manifest.yaml file, which would be very useful for providing project-level metadata. For example, the texts that OrigQuote and GLQuote refer to could be identified in the relation field of the manifest. Packaging also adds a structural requirement that would make it more consistent for software to import and export the data.


What do you think of this proposal? Does it make sense? Would it be helpful for you or your team if we made these changes? What are the drawbacks of this approach? Please comment with any thoughts you have.

This looks good, Jesse! Can the UBN notes be converted to this? Do the UBN headings for notes contain everything that this new format will need?

Yes, the existing content that we’ve created can be converted into this format. We don’t have the optional fields, but we can derive the other fields by pivoting the Markdown files. Of course, we would auto-generate the IDs so that they are unique.
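
For anyone wondering what auto-generating the IDs might look like, here is a minimal Python sketch. It is not our actual conversion script. It assumes IDs are drawn from lowercase letters and digits, which matches the examples k3cb and a8n4 and gives the 1,679,616 combinations mentioned above, and the helper name new_note_id is made up.

```python
import secrets
import string

# 26 lowercase letters + 10 digits = 36 symbols, so 36**4 = 1,679,616 IDs.
ALPHABET = string.ascii_lowercase + string.digits

def new_note_id(used_ids):
    """Return a four-character ID not yet in used_ids, and record it there."""
    if len(used_ids) >= len(ALPHABET) ** 4:
        raise RuntimeError("no IDs left for this verse")
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(4))
        if candidate not in used_ids:
            used_ids.add(candidate)
            return candidate

# IDs only need to be unique per Book, Chapter, Verse combination.
used_by_verse = {}  # (book, chapter, verse) -> set of IDs already issued
ref = ("TIT", "01", "03")
ids = [new_note_id(used_by_verse.setdefault(ref, set())) for _ in range(5)]
```

Retrying on a collision is enough here, because a single verse will hold only a handful of notes out of more than a million possible IDs.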