The Harlinn.Common.Core library includes C++ classes that makes it easy to use MicroSofts’ Extensible Storage Engine (ESE) in your apps. The classes are designed to be used in performant, highly concurrent, server development.

Background

Several years ago, while working on a system that stored, retrieved, analyzed large amounts of timeseries data in an Oracle database, we found that it was impossible to meet the performance requirements set for the project using the existing database schema as deployed on an Oracle RDBMS. Changing the database schema was not an option since that would also require changes to several large and critical systems.

After trying out several NoSQL database engines we found that MicroSofts’ Extensible Storage Engine (ESE) was a good fit for our needs.

Our main reasons for choosing ESE was that:

  • It’s installed with Windows, so it gets updates and bugfixes along with the rest of the OS, and it is used by well known Microsoft products such as:

    • Microsoft Exchange
    • Active Directory
    • Windows Search
  • It is reliable.
  • It’s fast - inserting more than 100 000 records per second is not unusual.
  • There are no additional costs.

This article shows how to:

The Extensible Storage Engine has a C based API, and after a while we decided that it would be beneficial for the progress of the project the create a C++ wrapper around the C API to simplify the code base and reduce programming errors. This turned out to be an excellent idea, as it really simplified and reduced the size of the ESE related code.

The C++ wrappers that Harlinn.Common.Core provides for the ESE C API, are a complete rewrite and an improvement compared to the wrappers I developed for that project, since:

  • They work nicely with the standard C++ libraries.
  • Use move semantics to safely and efficiently manage the ESE resources.

To use the C++ wrappers for Extensible Storage Engine you need to:

  • Include the HCCEse.h header file and link with the Harlinn.Common.Core.lib library.
  • Initialize and configure an instance.
  • Create a session.
  • Create or open a database in this session.

Here is an example that does this, and:

  • Creates a table.
  • Adds a boolean column.
  • Inserts two rows of data
  • Reads the two rows of data from the table and verifies that the retrieved data is as expected.
// Initialize the Extensible Storage Engine
Ese::Instance instance( L"TestInstance" );

// Set the path to the directory that will contain 
// the checkpoint file for the instance 
instance.SetSystemPath( DatabaseSystemPath );

// Set the path to the directory that will contain 
// the transaction log files for the instance 
instance.SetLogFilePath( DatabaseLogfilePath );

// Create the session
auto session = instance.BeginSession( );

// Create or open the database overwriting any 
// existing data
auto database = session.CreateDatabase( DatabasePath, 
                        Ese::CreateDatabaseFlags::OverwriteExisting );

// Create a table
auto table1 = database.CreateTable( "table1" );

// Add a column
auto columnId = table1.AddBoolean( "fieldName" );

// A lambda funtion that inserts one row in the table.
auto DoInsert = [ &table1, columnId ]( )
    {
        // Insert a new row
        table1.Insert( );
        // Assign a value to the column
        table1.SetColumn( columnId, true );
        // Write the data to the table
        table1.Update( );
    };

// Insert two rows of data
DoInsert( );
DoInsert( );
    
// Move the table cursor to the first row
if ( table1.MoveFirst( ) )
{
    // Retrieve the column value as a std::optional<bool>
    auto value1 = table1.As<bool>( columnId );

    // Move the table cursor to the next row
    if ( table1.MoveNext( ) )
    {
        auto value2 = false;
        // Retrieve the column value by passing a reference
        // to the variable that will be assigned the column
        // value 
        auto columnNotNull = table1.Read( columnId, value2 );

        // Verify that the data is as expected
        assert( columnNotNull == value1.has_value( ) == true );
        assert( value2 == value1.value( ) == true );

        // Verify that there are no more rows
        assert( table1.MoveNext( ) == false );
    }
}

If anything goes wrong, the library will throw an exception, so there is no need for elaborate error checking at each call site, but those exceptions must be handled.

The test code, in EseTests.cpp, which is included with the source code, demonstrates how to store and retrieve all datatypes supported by ESE.

There is also an example that demonstrates the performance of ESE, on my development laptop it:

  • Inserts 44 640 000 rows in 286 seconds, at a rate of 156 005 rows per second. This done by first searching for a row that matches the primary key. If found, the row is updated, otherwise a new row is inserted into the table.
  • Reads 44 640 000 rows in 30 seconds, at a rate of 1 479 087 rows per second.
  • Searches for, and retrieves 1 000 subsets of the data, for a total of 21 600 000 rows, in 27 seconds, at a rate of 794 690 rows per second.

The above are excellent numbers, and to see how this works out on your own system, build and execute:

ExampleCoreESE01.exe -c -r -t -d F:\<Path to database directory>\Database.edb

where

  • -c tells the program to create a new database
  • -d provides the path to the database file
  • -r tells the program to replace any existing database at that location
  • -t tells the program to execute the performance tests

This requires about 3GB of free space on your hard drive.

The library allows you to use both UTF16, and char based strings by providing overloads that use the Unicode version and the ANSI version of the ESE C API, and selects one or the other based on the string/character type of the arguments were this is applicable.

Harlinn::Common::Core::Ese

There are four major classes in the library that wraps the ESE C API handle types. The classes are move assignable and move constructible, but not copy assignable, and not copy constructible. This design is used to ensure that the lifetime of the handles is convenient to manage in an appropriate way, ensuring that the handles are always closed when the owning objects go out of scope.

Instance

Instance is the root object, and all use of the library should start by creating an instance of this type.

The Instance object holds and manages the lifetime of an ESE instance handle.

Session

A Session object holds and manages the lifetime of an ESE session handle. Sessions provide the transactional context of ESE, and all ESE database operations are performed through a session.

When an instance is used by more than one thread, each thread should have its own session.

Database

A Database object holds and manages the lifetime of a database handle. ESE database handles are used to manage the schema of the database and to manage tables inside the database. Database handles can only be used with the session that was used to create them, and the library helps to facilitate this.

Table

A Table object holds and manages the lifetime of a database cursor handle. ESE cursor handles are used to read row data, search rows; or create, update and delete rows. It is also used to define the columns and indexes for a table. Like database handles, the database cursor handles can only be used with the session that was used to create them, and the library helps to facilitate this too.

First steps

Normally, an application will have a single Ese::Instance object, and multiple sessions with one or more open databases in them. To keep the code for the unit tests simple, I bundled them into a single class called Engine that is used for most of them:

class Engine
{
public:
    Ese::Instance instance;
    Ese::Session session;
    Ese::Database database;

    Engine( bool createNewDatabase = true )
    {
        instance = Ese::Instance( "TestInstance" );
        
        // Configure ESE to create any required directory that is 
        // missing in a file system silently
        instance.SetCreatePathIfNotExist( );

        // Configure ESE to not display a dialog box in 
        case of an error.
        instance.SetExceptionAction( Ese::ExceptionAction::None );
        
        // Set the path to the directory that will contain 
        // the checkpoint file for the instance
        instance.SetSystemPath( DatabaseSystemPath );

        // Set the path to the directory that will contain 
        // the transaction log files for the instance 
        instance.SetLogFilePath( DatabaseLogfilePath );

        // Create the session
        session = instance.BeginSession( );
        if ( createNewDatabase )
        {
            // Create a new database, overwriting any existing
            // database.
            database =
               session.CreateDatabase( DatabasePath, 
                                       Ese::CreateDatabaseFlags::OverwriteExisting );
        }
        else
        {
            // Attach the database to the session
            session.AttachDatabase( DatabasePath );

            // Open the database
            database = session.OpenDatabase( DatabasePath );
        }
    }
};

If createNewDatabase is false, an instance of the Engine class will open an existing database.

The call to instance.SetCreatePathIfNotExist( ) configures ESE to create any required directory that is missing in a file system silently.

instance.SetExceptionAction( Ese::ExceptionAction::None ) configures ESE to not display a dialog box in case of error.

instance.SetSystemPath( DatabaseSystemPath ) configures the path to the directory that will contain the checkpoint file for the instance.

instance.SetLogFilePath( DatabaseLogfilePath ) configures the path to the directory that will contain the transaction logs for the instance.

When the instance is configured, we’re ready to create the Session object:

session = instance.BeginSession( );

Once we have a Session object, we can create new databases by calling Session::CreateDatabase, or we can attach and open an existing database, as shown above.

With this in place, creating a new table is as easy as:

BOOST_AUTO_TEST_CASE( CreateTableTest1 )
{
    Engine engine;
    auto& database = engine.database;
    auto table1 = database.CreateTable( "table1" );
    BOOST_TEST( table1.IsValid( ) );
}

Trivial, so lets look at something useful. Lets say you have the following struct and want to store the data in a table:

struct SensorValue
{
    // Identifies the sensor
    boost::uuids::uuid Sensor;

    // The timestamp for the sensor value
    std::chrono::system_clock::time_point Timestamp;

    // The flags value 
    int64_t Flags = 0;

    // The sensor value
    double Value = 0.0;
};

To keep everything in one place, we extend the Engine class:

class SensorEngine : public Engine
{
public:
    using Base = Engine;

When working with ESE I like to keep names short, particularly the column names as this simplifies index creation.

We need a name for the table:

    static constexpr char SensorValueTableName[] = "SV";

A name for the primary index:

    static constexpr char SensorValueIndexName[] = "ISV";

A name for the sensor column:

    static constexpr char SensorColumnName[] = "S";

A name for the timestamp column

    static constexpr char TimestampColumnName[] = "T";

A name for the flags column

    static constexpr char FlagsColumnName[] = "F";

And, finally a name for the value column:

    static constexpr char ValueColumnName[] = "V";

When working with columns in ESE we identity them by their column ids, so we need four column ids, one for each column:


    JET_COLUMNID SensorColumnId = 0;
    JET_COLUMNID TimestampColumnId = 0;
    JET_COLUMNID FlagsColumnId = 0;
    JET_COLUMNID ValueColumnId = 0;

We also need a table object:

    Ese::Table SensorValues;

The default constructor for Ese::Table creates an object that is not connected to an ESE table.

We could skip writing the constructor, since it doesn’t add anything, but I think implementing it makes the intent of the code clearer:

    SensorEngine( bool createDatabase = true )
        : Base( createDatabase )
    {
    }

Create a Table

Creating a table is rather straight forward:

    void CreateSensorValueTable( )
    {

First we need to start a transaction, which will be rolled back automatically unless we call transaction.Commit( ) when we are done:

        auto transaction = session.StartTransaction( );

Then we create the table:

        SensorValues = database.CreateTable( SensorValueTableName );

Before adding the four columns:

        SensorColumnId = SensorValues.AddGuid( SensorColumnName );
        TimestampColumnId = SensorValues.AddDateTime( TimestampColumnName );
        FlagsColumnId = SensorValues.AddUInt64( FlagsColumnName );
        ValueColumnId = SensorValues.AddDouble( ValueColumnName );

The various AddColumnType functions return the column id for the columns they create.

Now that we have created the table, it’s time to create the primary key, which is an index with the Ese::IndexFlags::Primary flag set. Every table should have a primary key, and if we don’t create one, ESE will create one for us, but then we don’t have control over its definition.

This "+S\0+T\0" is the definition we would like to have for the primary key of the table. Here S and T are the column names, and the + prefix before the column names tells ESE that want the index in ascending order for that column.

The index field definition for each column is required to be terminated by a \0 value, and the last field definition must be terminated by a double \0.

The last parameter to the CreateIndex function 7 is the number of characters in the index definition, including the final \0 that C++ adds to zero terminated strings.

        SensorValues.CreateIndex( SensorValueIndexName, Ese::IndexFlags::Primary, "+S\0+T\0", 7 );

We are done, and can commit the transaction.

        transaction.Commit( );
    }

Opening a Table

Opening a Table is easy:

    void OpenSensorValueTable( )
    {

First we call Database::OpenTable:

        SensorValues = database.OpenTable( SensorValueTableName );

Then we call Table::GetColumnId to get the column ids for each of the four columns:

        SensorColumnId = SensorValues.GetColumnId( SensorColumnName );
        TimestampColumnId = SensorValues.GetColumnId( TimestampColumnName );
        FlagsColumnId = SensorValues.GetColumnId( FlagsColumnName );
        ValueColumnId = SensorValues.GetColumnId( ValueColumnName );
    }

Create, Retrieve, Update and Delete (CRUD)

Now that we have a table to work with, we can implement functions to create, retrieve, update and delete rows from the table.

Creating a new row

Inserting a new row starts with calling the Table::Insert function, before calling one of the Table::SetColumn overloads for each of the columns, and finalizing the operation with a call to Table::Store.

    void Insert( const SensorValue& value )
    {
        SensorValues.Insert( );
        SensorValues.SetColumn( SensorColumnId, value.Sensor );
        SensorValues.SetColumn( TimestampColumnId, value.Timestamp );
        SensorValues.SetColumn( FlagsColumnId, value.Flags );
        SensorValues.SetColumn( ValueColumnId, value.Value );
        SensorValues.Store( );
    }

Read column data

Reading column values is performed by calling one of the Table::Read overloads. The Table::Read overloads returns false if the column value is NULL, but since none of the columns can be set to NULL, the return value can be safely ignored. First we create a function that reads the value stored in the S column:

    boost::uuids::uuid Sensor( ) const
    {
        boost::uuids::uuid result;
        SensorValues.Read( SensorColumnId, result );
        return result;
    }

Then we create a function that reads the value stored in the T column:

    std::chrono::system_clock::time_point Timestamp( ) const
    {
        std::chrono::system_clock::time_point result;
        SensorValues.Read( TimestampColumnId, result );
        return result;
    }

Afterwards we create a function that reads the value stored in the F column:

    int64_t Flags( ) const
    {
        int64_t result;
        SensorValues.Read( FlagsColumnId, result );
        return result;
    }

Finally we create a function that reads the value stored in the V column:

    double Value( ) const
    {
        double result;
        SensorValues.Read( ValueColumnId, result );
        return result;
    }

To bring it all together we can wrap it up by creating a Retrieve function that returns our SensorValue object.

    SensorValue Retrieve( )
    {
        SensorValue result{ Sensor( ), Timestamp( ), Flags( ), Value( ) };
        return result;
    }

Searching

To place the table cursor on the first row that contains data for a sensor, which is the one with the earliest timestamp, we first create a search key using the one of the Table::MakeKey function overloads. Search keys are used to search for a row of data matching an index. When we have one or more secondary indexes defined for a table, we use the Table::SetCurrentIndex function to select the active search index for the table. Since we have only created a primary index for the table, it will always be the search index. The S column, which identifies the sensor, is the first member of the index, and the T column is the second and final member of the index.

By passing the sensor id to Table::MakeKey along with Ese::KeyFlags::NewKey to indicate that this is the first value of a new sear key, and Ese::KeyFlags::FullColumnStartLimit indicates this is the last value we will specify for the search key, and that it will be used to match the index entry closest to the start of the index that satisfies the key.

In other words, we create a that will only look at the first column of the index when looking for matches with Table::Seek or Table::SetIndexRange.

    bool MoveTo( const boost::uuids::uuid& sensorId ) const
    {
        SensorValues.MakeKey( sensorId, Ese::KeyFlags::NewKey | 
                                        Ese::KeyFlags::FullColumnStartLimit );

Next we call Table::Seek passing Ese::SeekFlags::GreaterOrEqual as its argument because this will place the cursor on the row closest to the beginning of the active search index that matches the current search key.

        auto rc = SensorValues.Seek( Ese::SeekFlags::GreaterOrEqual );
        return rc >= Ese::Result::Success;
    }

The MoveTo function can be used to iterate over the stored sensor values for a sensor like this:

success = engine.MoveTo( firstSensor );
if ( success )
{
    do
    {
        auto sensorId = engine.Sensor( );
        if ( sensorId != firstSensor )
        {
            break;
        }
    } while ( sensorValues.MoveNext( ) );
}

Next we create a function that locates the sensor value, for a sensor, with the greatest timestamp less or equal to the argument timestamp. In other words, the current sensor value at the requested time.

    bool MoveTo( const boost::uuids::uuid& sensorId, 
        const std::chrono::system_clock::time_point& timestamp, 
        bool requireFullMatch = false ) const
    {

This time we need to create a search key with two entries, one for the sensor id, and one for the timestamp.

        SensorValues.MakeKey( sensorId, Ese::KeyFlags::NewKey );
        SensorValues.MakeKey( timestamp );

By default we pass Ese::SeekFlags::LessOrEqual to Table::Seek because this will place the cursor on the row that compares less of equal to to the search key that is closest to the end of the search index. When requireFullMatch is true, it’s because we want to move to a record that exactly matches the search key.

        auto rc = SensorValues.Seek( requireFullMatch ? 
                Ese::SeekFlags::Equal : Ese::SeekFlags::LessOrEqual );
        return rc >= Ese::Result::Success;
    }

Implementing a function that places the cursor on the record with the latest timestamp, is similar to creating the bool MoveTo( const boost::uuids::uuid& sensorId ) const function:

    bool MoveToLast( const boost::uuids::uuid& sensorId ) const
    {
        SensorValues.MakeKey( sensorId, Ese::KeyFlags::NewKey | 
                            Ese::KeyFlags::FullColumnEndLimit );

        auto rc = SensorValues.Seek( Ese::SeekFlags::LessOrEqual );
        return rc >= Ese::Result::Success;
    }

It would be nice if we could simplify the code used to iterate over the sensor values, like this:

success = engine.Filter( secondSensor );
if ( success )
{
    do
    {
    } while ( sensorValues.MoveNext( ) );
}

This can be accomplished using the Table::SetIndexRange function.

    bool Filter( const boost::uuids::uuid& sensorId ) const
    {
        auto success = MoveTo( sensorId );
        if ( success )
        {
            SensorValues.MakeKey( sensorId, Ese::KeyFlags::NewKey | 
                                Ese::KeyFlags::FullColumnEndLimit );

Calling Table::SetIndexRange with Ese::IndexRangeFlags::UpperLimit creates a limit for the records that can be navigated to using Table::MoveNext or Table::Move( JET_MoveNext ), while Ese::IndexRangeFlags::Inclusive indicates that the row matching the search key is included in the set of rows that the cursor can navigate to when this index range is active.

            SensorValues.SetIndexRange( Ese::IndexRangeFlags::Inclusive | 
                                Ese::IndexRangeFlags::UpperLimit );
            return true;
        }
        else
        {
            return false;
        }
    }

Similarly, it would be nice to be able to move backwards over the index:

success = engine.ReverseFilter( secondSensor );
if ( success )
{
    do
    {
    } while ( sensorValues.MovePrevious( ) );
}

ReverseFilter can be implemented like this:

    bool ReverseFilter( const boost::uuids::uuid& sensorId ) const
    {
        auto success = MoveToLast( sensorId );
        if ( success )
        {
            SensorValues.MakeKey( sensorId, Ese::KeyFlags::NewKey | Ese::KeyFlags::FullColumnStartLimit );
            SensorValues.SetIndexRange( Ese::IndexRangeFlags::Inclusive );
            return true;
        }
        else
        {
            return false;
        }
    }

Updating a Row in a Table

To update a row of data in the table we must first navigate to the row that we want to update, which is why we have the requireFullMatch parameter for

    bool MoveTo( const boost::uuids::uuid& sensorId, 
        const std::chrono::system_clock::time_point& timestamp, 
        bool requireFullMatch = false ) const

This overload of MoveTo will return false if it cannot find a row where S is equal to sensorId and T is equal to timestamp when true is passed for the requireFullMatch parameter.

    bool Update( const SensorValue& value )
    {

So first we locate the row:

        if ( MoveTo( value.Sensor, value.Timestamp, true ) )
        {

Then call Table::Replace to prepare the table for an update to the columns that are not part of the primary index.

            SensorValues.Replace( );

Assign the values that we want to write to the columns of the row:

            SensorValues.SetColumn( FlagsColumnId, value.Flags );
            SensorValues.SetColumn( ValueColumnId, value.Value );

And finalize the update with a call to Table::Store:

            SensorValues.Store( );
            return true;
        }
        return false;
    }

Deleting a Row in a Table

To delete a row in a table we must first position the cursor on the row we want to delete, and then call Table::Delete

    bool Delete( const SensorValue& value )
    {
        if ( MoveTo( value.Sensor, value.Timestamp, true ) )
        {
            SensorValues.Delete( );
            return true;
        }
        return false;
    }

That’s it, end of class :-)

};