Two-dimensional sparse bitmap implementation for Node.js

Mirror of https://github.com/electric-sheep-co/2d-sparse-bitmaps-node/
https://www.npmjs.com/2d-sparse-bitmaps
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
 
 
Ryan Joseph 9ab38de5fd Bump to v0.1.0 3 months ago
.github/workflows More README work 3 months ago
cli New: Initial 2dsb-cli implementation 3 months ago
stores Bump to v0.1.0 3 months ago
test Update: benchmark with new options and some fixes 3 months ago
tools New: Initial 2dsb-cli implementation 3 months ago
.eslintrc.json Initial commit 3 months ago
.gitignore Rudimentary ioredis test 3 months ago
LICENSE Create LICENSE 3 months ago
README.md New: Initial 2dsb-cli implementation 3 months ago
constants.js Refactor 3 months ago
impl.js Refactor 3 months ago
index.js Update: benchmark with new options and some fixes 3 months ago
package-lock.json Bump to v0.1.0 3 months ago
package.json Bump to v0.1.0 3 months ago

README.md

2d-sparse-bitmaps

CI Status Dependencies Dev Dependencies

A two-dimensional sparse bitmap implementation for Node.js 10 or later, with no required dependencies.

Allows for flexible backing store choice, the primary supported being Redis (via ioredis).

The underlying chunked implementation is efficient; the following example needs only 64 bytes to represent two coordinates which are ~1,414,213 units distant each other on the diagonal:

const TwoD = require('2d-sparse-bitmaps');
const bitmap = new TwoD.SparseBitmap({ [TwoD.ChunkWidthKey]: 16 });

await bitmap.set('so-far-away', 0, 0);
await bitmap.set('so-far-away', 1e6, 1e6);

Additionally, the sparse nature of the data structure allows for performant queries within specified bounds via inBounds().

Installation

$ npm install 2d-sparse-bitmaps

Examples

The included tests, benchmarking tool & cli hopefully provide ample usage examples, and in the case of the cli an easily-accessibly interactive means of working with the library.

Usage

All coordinates must be unsigned, a limitation that may be removed in future releases.

Instantiation

With the default in-memory store:

const TwoD = require('2d-sparse-bitmaps');
const bitmap = new TwoD.SparseBitmap();

With an ioredis instance:

const Redis = require('ioredis');
const TwoD = require('2d-sparse-bitmaps');

const rConn = new Redis();
const bitmap = new TwoD.SparseBitmap({ [TwoD.BackingStoreKey]: rConn });

Full options

Constant Name Description Default Restrictions
ChunkWidthKey The width in bits of each chunk in the sparse bitmap 128 >= 8, must be a multiple of 8
KeyPrefixKey The string preprended to each key before being passed onto the backing store. sparse-bitmap none
BackingStoreKey The backing store instance to be used. InMemoryStore Must conform to the aforementioned interface.

Each chunk requires up to (X / 8) * X bytes of storage, where X is the chosen chunk width. Accordingly, the default chunk width of 128 requires 2048 bytes per chunk.

Backing store interface

Any backing store must implement this interface:

getbit(key, bitPosition);
setbit(key, bitPosition, value);
getBuffer(key);

It may optionally implement pipeline(), which must return an instance implementing the aforementioned interface plus exec() for pipeline execution; additionally, the backing store interface methods of this instance must accept an additional callback argument of type function (err, result).

The default InMemoryStore provides an example implementation sans pipeline() et. al.

Get:

Cannot be called within pipelinedMutate() context. No option to pipeline these calls is available; high-volume searches should be performed with inBounds() instead.

The bit at (x, y) in key:

const xySet = await bitmap.get(key, x, y);

Set:

The bit at (x, y) in key:

await bitmap.set(key, x, y);

Unset:

The bit at (x, y) in key:

await bitmap.unset(key, x, y);

Get all set-bits in given bounds:

Cannot be called within pipelinedMutate() context. This method performs it’s own internal pipelining to be as performant as possible.

Within the bounding box definition - here named bBox - from is the top-left coordinate and to is the bottom-right coordinate:

const bBox = {
  from: { x: …, y: … },
  to: { x: …, y: … }
};

const allInBounds = await bitmap.inBounds(key, bBox);

allInBounds will be a list of two-element lists (tuples), where the x coordinate is the first value ([0]) and y is the second ([1]).

Has a third optional parameter, strict, which if set to true will cull the list before return to only include points strictly within the given bounding box; otherwise, points within any chunk intersected by the bounding box will be returned.

Get a key-bound instance:

The returned instance has the same interace as above except that all methods no longer take the key argument, as tgey are all now bound to key specifically:

const occupiedBitmap = bitmap.boundToKey('occupied');
await occupiedBitmap.set(x, y);
const check = await occupiedBitmap.get(x, y);
await occupiedBitmap.unset(x, y);
const occupiedInBounds = await occupiedBitmap.inBounds({ … });

Pipelined mutation:

When executing many mutations (set() and unset()) in high volume and/or frequency, the pipelineMutate() method should be used to provide a context in which any calls to these mutators - including methods of instances produced by boundToKey() - will be pipelined appropriately and therefore executed as an atomic unit:

const bitmap = new TwoD.SparseBitmap({ [TwoD.BackingStoreKey]: new Redis() });
const keyBound = bitmap.boundToKey('foobar');

const scopeReturn = await bitmap.pipelinedMutate(async () => {
  for (…) {
    await bitmap.set('foobar', x, y);
  }

  for (…) {
    await keyBound.unset(x, y);
  }
});

Upon return of the scoped function, the pipeline will be executed (with the results being discarded) and the result of the scoped function returned by pipelinedMutate().

When used with a backing store that does not support pipelining, the mutators are executed normally (against the store at call-time), therefore this method may be used with any backing store regardless of its pipelining capability or lack thereof.

Building a large pipeline may cause significant runtime memory pressure and as such has the potential to cause out-of-memory conditions. Consumers are advised to create reasonably-sized pipelines, though the author does not (yet) provide guidance as to what qualifies “reasonably-sized”.

The accessor methods of SparseBitmap - get() and inBounds() - cannot be called within a pipelined context! Doing so is a programming error and will result in an exception being thrown.

Contributing

Any and all contributions are welcome and the project is accepting of pull requests at all times.

Testing

The full test suite (including linting) is run via:

$ npm test

If the NODE_ENV environment variable is set to ci, all tests against redis will be skipped entirely.

When running the tests against redis, the host is assumed to be the local machine on the standard port. Additionally:

  • REDIS_LOCAL_AUTH will be used as the connection password, if set.
  • REDIS_LOCAL_DB will select the redis DB to test within, if set.

Utilizing tape, each individual test file can be executed in isolation:

$ node test/default-store.js
…
# ok

Authors

Benchmarks

The following rudimentary benchmarks were performed on an AWS EC2 type m5a.large (4 vCPUs, 16GB RAM) running the Ubuntu 20LTS AMI (ID ami-03ceeb0f46ee38ce7). Both the backing store redis instance and benchmark script were run on this VM. All redis snapshotting was disabled. The EC2 instance was created specifically for this use and had no other workloads during the benchmarking process.

Version particulars:

  • Linux kernel 5.4.0-1020-aws
  • Node 14.6.0 (v8 8.4.371.19-node.12)
  • Redis 5.0.7 (636cde3b5c7a3923) (standalone)

The benchmark run consisted three runs of differing multipliers (1, 3 & 5), each 101 iterations of the full sequence defined in benchmark::T.seq.main().

The primary takeaway is the significant increase in performance afforded by pipelined operations.

According to this limited dataset, the mean search rate is ~1.2 Gbit/s (~150 MB/s), though it does depend on how sparsely-populated the search area is (with rates from ~493 Mbit/s - ~2 Gbit/s observed). Without pipelining enabled, this rate drops drastically to ~183 Mbit/s (a 6.5x drop).

The full data set is linked below, and the Google Sheet used to post-process and analyze the data is available here.

Raw data

License

MIT