BCON

BSON C Object Notation


Gary J. Murakami, Ph.D.

gary.murakami@10gen.com


mongo-c-driver


Driver Days

Tuesday, September 18, 2012

The Problem


    { "BSON" : [ "awesome", 5.05, 1986 ] }
    

C Driver


    bson bs[1];
    bson_init( bs );
    bson_append_start_array( bs, "BSON" );
    bson_append_string( bs, "0", "awesome" );
    bson_append_double( bs, "1", 5.05 );
    bson_append_int( bs, "2", 1986 );
    bson_append_finish_array( bs );
    bson_finish( bs );
    
  • laborious
  • unreadable
  • unmaintainable
  • not fun

The Solution


    { "BSON" : [ "awesome", 5.05, 1986 ] }
    

BCON: BSON C Object Notation


    bcon bc[] = { "BSON", "[", "awesome", BF(5.05), BI(1986), "]", "." };
    

  • C feature set
    • C has no nested polymorphic arrays or dictionaries
  • C initializer
    • compiled data, not code

Comparison


    { "BSON" : [ "awesome", 5.05, 1986 ] }
    

BCON


    bson b[1];
    bcon bc[] = { "BSON", "[", "awesome", BF(5.05), BI(1986), "]", "." };
    bson_from_bcon( b, bc );
    

C Driver


    bson bs[1];
    bson_init( bs );
    bson_append_start_array( bs, "BSON" );
    bson_append_string( bs, "0", "awesome" );
    bson_append_double( bs, "1", 5.05 );
    bson_append_int( bs, "2", 1986 );
    bson_append_finish_array( bs );
    bson_finish( bs );
    

More Examples


    bcon goodbye[] = { "hello", "world", "goodbye", "world", "." };

    bcon contact_info[] = {
        "firstName", "John",
        "lastName" , "Smith",
        "age"      , BI(25),
        "address"  ,
        "{",
            "streetAddress", "21 2nd Street",
            "city"         , "New York",
            "state"        , "NY",
            "postalCode"   , "10021",
        "}",
        "phoneNumber",
        "[",
            "{",
                "type"  , "home",
                "number", "212 555-1234",
            "}",
            "{",
                "type"  , "fax",
                "number", "646 555-4567",
            "}",
        "]",
        BEND
    };
    

Implementation - C Union


    typedef union bcon {
      char *s;       /**< 02 e_name string    Macro BS(v) - UTF-8 string */ /* must be first to be default */
      double f;      /**< 01 e_name double    Macro BF(v) - Floating point */
      char *o;       /**< 07 e_name (byte*12) Macro BO(v) - ObjectId */
      bson_bool_t b; /**< 08 e_name 00        Macro BB(v) - Boolean "false"
                          08 e_name 01        Macro BB(v) - Boolean "true" */
      time_t t;      /**< 09 e_name int64     Macro BT(v) - UTC datetime */
      char *v;       /**< 0A e_name           Macro BNULL - Null value */
      char *x;       /**< 0E e_name string    Macro BX(v) - Symbol */
      int i;         /**< 10 e_name int32     Macro BI(v) - 32-bit Integer */
      long l;        /**< 12 e_name int64     Macro BL(v) - 64-bit Integer */
    } bcon;
    
  • choose char* cstring as the default type
  • first union field is the default
  • document is C array of bcon union elements
  • keys, values, documents, BSON arrays
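
The "first union field is the default" point rests on a C rule: an undesignated initializer for a union sets its first member. The following is a minimal sketch of that rule only. The `mini_bcon` union and `mini_bcon_demo` function are hypothetical names for illustration, not part of the driver.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical cut-down version of the bcon union, for illustration only. */
typedef union mini_bcon {
    const char *s; /* first member: receives any plain (undesignated) initializer */
    double f;
    int i;
} mini_bcon;

/* Plain string literals land in .s, so keys and string values need no macro. */
static const mini_bcon goodbye[] = { "hello", "world", "goodbye", "world", "." };

/* Checks that every element was initialized through the default member .s. */
void mini_bcon_demo(void) {
    assert(sizeof goodbye / sizeof goodbye[0] == 5);
    assert(strcmp(goodbye[0].s, "hello") == 0);
    assert(strcmp(goodbye[4].s, ".") == 0);
}
```

Because the array is a static initializer, the compiler lays it out as data; nothing runs until a generator walks the array.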

Special cstrings

  • use "." or BEND macro for termination
  • use "{" and "}" for document
  • use "[" and "]" for array
  • use special cstrings for type specification

    /** BCON internal 01 double Floating point type-specifier */
    #define BTF ":_f:"
    /** BCON internal 02 char* string type-specifier */
    #define BTS ":_s:"
    /** BCON internal 07 char* ObjectId type-specifier */
    #define BTO ":_o:"
    /** BCON internal 08 int Boolean type-specifier */
    #define BTB ":_b:"
    /** BCON internal 09 int64 UTC datetime type-specifier */
    #define BTT ":_t:"
    /** BCON internal 0A Null type-specifier */
    #define BTN ":_v:"
    /** BCON internal 0E char* Symbol type-specifier */
    #define BTX ":_x:"
    /** BCON internal 10 int32 32-bit Integer type-specifier */
    #define BTI ":_i:"
    /** BCON internal 12 int64 64-bit Integer type-specifier */
    #define BTL ":_l:"
    

Value Macros


    /** BCON 01 double Floating point value */
    #define BF(v) BTF, { .f = (v) }
    /** BCON 02 char* string value */
    #define BS(v) BTS, { .s = (v) }
    /** BCON 07 char* ObjectId value */
    #define BO(v) BTO, { .o = (v) }
    /** BCON 08 int Boolean value */
    #define BB(v) BTB, { .b = (v) }
    /** BCON 09 int64 UTC datetime value */
    #define BT(v) BTT, { .t = (v) }
    /** BCON 0A Null value */
    #define BNULL BTN, { .v = ("") }
    /** BCON 0E char* Symbol value */
    #define BX(v) BTX, { .x = (v) }
    /** BCON 10 int32 32-bit Integer value */
    #define BI(v) BTI, { .i = (v) }
    /** BCON 12 int64 64-bit Integer value */
    #define BL(v) BTL, { .l = (v) }
    

each value macro expands to two bcon elements: a type-specifier string and the value

clarity and conciseness
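
To make the two-element expansion concrete, here is a small sketch that reuses the `BTF`/`BTI` and `BF`/`BI` definitions from the slides on a hypothetical cut-down union, `demo_bcon`. It assumes a C99 compiler for the designated initializers.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical cut-down union; the real bcon union has more members. */
typedef union demo_bcon {
    const char *s; /* default member */
    double f;
    int i;
} demo_bcon;

#define BTF ":_f:"
#define BTI ":_i:"
#define BF(v) BTF, { .f = (v) }
#define BI(v) BTI, { .i = (v) }

/* Seven source tokens, but BF and BI each contribute two array elements. */
static const demo_bcon demo_doc[] = {
    "BSON", "[", "awesome", BF(5.05), BI(1986), "]", "."
};

/* Verifies the layout: specifier string first, typed payload second. */
void demo_bcon_demo(void) {
    assert(sizeof demo_doc / sizeof demo_doc[0] == 9);
    assert(strcmp(demo_doc[3].s, ":_f:") == 0);
    assert(demo_doc[4].f == 5.05);
    assert(strcmp(demo_doc[5].s, ":_i:") == 0);
    assert(demo_doc[6].i == 1986);
}
```

The specifier string tells a reader of the array which union member to read from the next element, which is how a non-string value can sit in an array whose default member is a string.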

Reference Interpolation

  1. modify values
  2. call bson_from_bcon to interpolate values

    bson b[1];
    char name[] = "pi";
    double value = 3.14159;
    bcon bc[] = { "name", BRS(name), "value", BRF(&value), BEND };
    bson_from_bcon( b, bc ); /* generates { name: "pi", value: 3.14159 } */
    if (verbose) bson_print( b );
    strcpy(name, "e");
    value = 2.71828;
    bson_from_bcon( b, bc ); /* generates { name: "e", value: 2.71828 } */
    if (verbose) bson_print( b );
    
  • pass &variable for scalar value types; pass C arrays without & (the array name already decays to a pointer)
  • bcon fields, e.g., double *Rf;
  • type-specifiers, e.g., #define BTRF ":Rf:"
  • value macros, e.g., #define BRF(v) BTRF, { .Rf = (v) }
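
The idea behind reference interpolation can be sketched without the driver: the array stores the address of a variable, and the value is read only when the array is walked. The `ref_bcon` union, the `BRS`/`BTRS` definitions (written by analogy with `BRF`/`BTRF` above), and the `ref_lookup_double` helper are assumptions for illustration. The real `bson_from_bcon` builds a BSON document instead of returning one value.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical cut-down union: only the members this sketch needs. */
typedef union ref_bcon {
    const char *s; /* default member: keys, type-specifiers, terminator */
    char *Rs;      /* reference to a C string buffer */
    double *Rf;    /* reference to a double */
} ref_bcon;

#define BTRS ":Rs:"
#define BTRF ":Rf:"
#define BRS(v) BTRS, { .Rs = (v) }
#define BRF(v) BTRF, { .Rf = (v) }

static char ref_name[8] = "pi";
static double ref_value = 3.14159;
static const ref_bcon ref_doc[] = {
    "name", BRS(ref_name), "value", BRF(&ref_value), "."
};

/* Returns the double currently referenced under `key`, or `fallback` if the
   key is absent or not a double reference. Every value in this sketch is
   typed, so each pair occupies three elements: key, specifier, payload.
   The dereference happens at call time, not when ref_doc was initialized. */
static double ref_lookup_double(const ref_bcon *bc, const char *key, double fallback) {
    while (strcmp(bc->s, ".") != 0) {
        const ref_bcon *spec = bc + 1; /* type-specifier following the key */
        if (strcmp(spec->s, BTRF) == 0 && strcmp(bc->s, key) == 0)
            return *spec[1].Rf;
        bc += 3;
    }
    return fallback;
}

/* Modifies the variable, not the array, and observes the new value. */
void ref_bcon_demo(void) {
    assert(ref_lookup_double(ref_doc, "value", -1.0) == 3.14159);
    ref_value = 2.71828;
    assert(ref_lookup_double(ref_doc, "value", -1.0) == 2.71828);
}
```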

Pointer Interpolation

  1. set pointer to null to skip key-value interpolation
  2. or set pointer to new value locations
  3. call bson_from_bcon to interpolate values

    bson b[1];
    char name[] = "pi";
    char new_name[] = "log(0)";
    char **pname = (char**)&name;
    double value = 3.14159;
    double *pvalue = &value;
    bcon bc[] = { "name", BPS(&pname), "value", BPF(&pvalue), BEND };
    bson_from_bcon( b, bc ); /* generates { name: "pi", value: 3.14159 } */
    pname = (char**)&new_name;
    pvalue = 0;
    bson_from_bcon( b, bc ); /* generates { name: "log(0)" } */
    
  • & precedes all macro pointer arguments
  • bcon fields, e.g., double **Pf;
  • type-specifiers, e.g., #define BTPF ":Pf:"
  • value macros, e.g., #define BPF(v) BTPF, { .Pf = (v) }
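
Pointer interpolation adds one more level of indirection, so a null pointer can switch a key-value pair off. The sketch below shows only that skip rule, under assumptions: `ptr_bcon` is a hypothetical cut-down union, and `ptr_emitted_pairs` is a hypothetical helper that counts the pairs a generator would emit instead of building BSON.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical cut-down union: default string member plus one pointer type. */
typedef union ptr_bcon {
    const char *s; /* default member: keys, direct strings, specifiers */
    double **Pf;   /* pointer to a pointer to double; inner NULL means "skip" */
} ptr_bcon;

#define BTPF ":Pf:"
#define BPF(v) BTPF, { .Pf = (v) }

static double ptr_pi = 3.14159;
static double *ptr_pvalue = &ptr_pi;
static const ptr_bcon ptr_doc[] = {
    "name", "pi", "value", BPF(&ptr_pvalue), "."
};

/* Counts the key-value pairs that would be emitted for a flat document.
   A direct string value takes two elements (key, value); a pointer value
   takes three (key, specifier, payload) and is skipped when it is NULL. */
static int ptr_emitted_pairs(const ptr_bcon *bc) {
    int n = 0;
    while (strcmp(bc->s, ".") != 0) {
        const ptr_bcon *v = bc + 1;
        if (strcmp(v->s, BTPF) == 0) {
            if (*v[1].Pf != NULL)
                n++;
            bc += 3;
        } else {
            n++;
            bc += 2;
        }
    }
    return n;
}

/* Setting the pointer to NULL drops the "value" pair on the next walk. */
void ptr_bcon_demo(void) {
    assert(ptr_emitted_pairs(ptr_doc) == 2);
    ptr_pvalue = NULL;
    assert(ptr_emitted_pairs(ptr_doc) == 1);
}
```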

Types and Interpolation Notes

  • full spectrum of base types, document, and array
  • "direct" value, reference interpolation, and pointer interpolation
  • C array reference semantics
    • completeness/consistency vs. redundancy
      • BS, BD, BA, BO, BX <--> BRS, BRD, BRA, BRO, BRX


Not Implemented


        /*   05  e_name  binary              Binary data */
        /*   0B  e_name  cstring cstring     Regular expression */
        /*   0D  e_name  string              JavaScript code */
        /*   0F  e_name  code_w_s            JavaScript code w/ scope  */
        /*   11  e_name  int64               Timestamp */
        /*   FF  e_name                      Min key */
        /*   7F  e_name                      Max key */
        

opportunity

BCON Functions


    bcon_error_t bson_append_bcon(bson *b, const bcon *bc);

    bcon_error_t bson_from_bcon( bson *b, const bcon *bc );

    void bcon_print( const bcon *bc );
    

BSON generation

  • Finite State Machine
    • 5 states, 8 lexical token types, 16 transitions
    • augmented with stacks, not recursion
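
As a rough illustration of "stacks, not recursion", the sketch below checks document and array nesting with an explicit stack. It is not the BCON state machine: it has none of the 5 states or 16 transitions, works on a plain array of strings rather than the bcon union, and would misread a string value that is itself "{" or "[". The names `nesting_ok`, `nesting_demo`, and `MAX_DEPTH` are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAX_DEPTH 32 /* arbitrary nesting limit chosen for this sketch */

/* Returns 1 if every "{" and "[" is closed by the matching "}" or "]"
   before the "." terminator, 0 otherwise. The token array must end
   with ".". Open delimiters are kept on an explicit stack. */
static int nesting_ok(const char *const *tok) {
    char stack[MAX_DEPTH];
    int top = 0;
    for (; strcmp(*tok, ".") != 0; tok++) {
        const char *t = *tok;
        if (strcmp(t, "{") == 0 || strcmp(t, "[") == 0) {
            if (top == MAX_DEPTH)
                return 0; /* too deep for this sketch */
            stack[top++] = t[0];
        } else if (strcmp(t, "}") == 0 || strcmp(t, "]") == 0) {
            char open = (t[0] == '}') ? '{' : '[';
            if (top == 0 || stack[--top] != open)
                return 0; /* close without a matching open */
        }
    }
    return top == 0; /* anything left open is an error */
}

/* One well-formed and one mismatched token sequence. */
void nesting_demo(void) {
    const char *good[] = { "BSON", "[", "awesome", "]", "." };
    const char *bad[] = { "a", "{", "b", "[", "}", "." };
    assert(nesting_ok(good) == 1);
    assert(nesting_ok(bad) == 0);
}
```

An explicit stack keeps the generator a single loop with bounded memory, which is one plausible reason to prefer it over recursion in a driver.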


Performance

1.1 to 1.2 times

compiler optimization -O3 - important

Future Write Concern Improvement


    typedef struct mongo_write_concern {
        int w;            /**< Number of total replica write copies to complete including the primary. */
        int wtimeout;     /**< Number of milliseconds before replication timeout. */
        int j;            /**< If non-zero, block until the journal sync. */
        int fsync;        /**< Same as j with journaling enabled; otherwise, call fsync. */
        const char *mode; /**< Either "majority" or a getlasterrormode. Overrides w value. */

        bson *cmd; /**< The BSON object representing the getlasterror command. */
    } mongo_write_concern;

    void mongo_write_concern_init( mongo_write_concern *write_concern );

    int mongo_write_concern_finish( mongo_write_concern *write_concern );

    void mongo_write_concern_destroy( mongo_write_concern *write_concern );

    void mongo_set_write_concern( mongo *conn, mongo_write_concern *write_concern );

    int mongo_insert( mongo *conn, const char *ns, const bson *data, mongo_write_concern *custom_write_concern );

    int mongo_insert_batch( mongo *conn, const char *ns, const bson **data, int num, mongo_write_concern *custom_write_concern, int flags );

    int mongo_update( mongo *conn, const char *ns, const bson *cond, const bson *op, int flags, mongo_write_concern *custom_write_concern );

    int mongo_remove( mongo *conn, const char *ns, const bson *cond, mongo_write_concern *custom_write_concern );
    
  • no tag support
  • additional code and documentation


replace with bson/bcon to eliminate all of the above
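
One way to read this proposal: the write concern becomes an ordinary getlasterror document written in BCON. The sketch below is an assumption about what that could look like, not the driver's design. It uses a hypothetical cut-down union `wc_bcon`, assumes `BEND` is equivalent to ".", and the `w` and `wtimeout` values are purely illustrative.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical cut-down union and macros, mirroring the slide definitions. */
typedef union wc_bcon {
    const char *s; /* default member */
    int i;
} wc_bcon;

#define BTI ":_i:"
#define BI(v) BTI, { .i = (v) }
#define BEND "." /* assumed equivalent to the "." terminator */

/* Describes { getlasterror: 1, w: 2, wtimeout: 1000 } as static data. */
static const wc_bcon wc_doc[] = {
    "getlasterror", BI(1),
    "w"           , BI(2),
    "wtimeout"    , BI(1000),
    BEND
};

/* Three typed pairs (three elements each) plus the terminator. */
void wc_bcon_demo(void) {
    assert(sizeof wc_doc / sizeof wc_doc[0] == 10);
    assert(strcmp(wc_doc[3].s, "w") == 0);
    assert(wc_doc[5].i == 2);
    assert(wc_doc[8].i == 1000);
}
```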

Conclusions

  • Easy BSON construction for development and maintenance
  • Reference and Pointer Interpolation
  • Enables cleanup and simplification of the C driver
  • Usability - better, easier to use C driver


Discussion

  • Extensions
    • function interpolation experiment - too complex?
  • Other Drivers/Languages?
  • Feedback