Finding Tech

Always find what I found
GRSecurity

For more than a decade, grsecurity has provided web hosting companies and other Linux users with the highest level of security available for any mainstream OS.

Unlike other expensive security “solutions” that pretend to achieve security through known-vulnerability patching, signature-based detection, or other reactive methods, grsecurity provides real proactive security. The only solution that hardens both your applications and operating system, grsecurity is essential for public-facing servers and shared-hosting environments.

Only grsecurity provides protection against zero-day and other advanced threats that buys administrators valuable time while vulnerability fixes make their way out to distributions and production testing.

Add increased authentication for administrators, audit important system events, and confine your system with no manual configuration through advanced Role-Based Access Control.

Use Trusted Path Execution to prevent users from executing their own binaries or binaries in unsafe locations.
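To make the idea concrete, here is a minimal userspace sketch of the kind of check Trusted Path Execution implies. grsecurity enforces its policy inside the kernel, so this is only an illustration. The function name `is_trusted_path`, the `trusted_uid` parameter, and the exact rule (the containing directory must be owned by the trusted user and not writable by group or others) are assumptions made for this example.

```python
import os
import stat


def is_trusted_path(path, trusted_uid=0):
    """Return True if the directory holding `path` looks 'trusted'.

    Assumed rule for this sketch: the directory is owned by `trusted_uid`
    (root by default) and is not writable by group or others.
    """
    directory = os.path.dirname(os.path.abspath(path))
    info = os.stat(directory)
    owned_by_trusted_user = info.st_uid == trusted_uid
    writable_by_others = bool(info.st_mode & (stat.S_IWGRP | stat.S_IWOTH))
    return owned_by_trusted_user and not writable_by_others
```

Under this rule, a binary dropped into a world-writable directory such as `/tmp` would be rejected, because anyone could have placed or replaced it there.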

Invisibly reinforce the most common filesystem isolation, turning it into a true jail.

Through a partnership with the PaX project, the creators of ASLR and many other exploit prevention techniques (some now imitated by Microsoft and Apple), grsecurity makes many attacks technically and economically infeasible. It introduces unpredictability and complexity to attempted attacks while actively responding in ways that deny the attacker another chance.

Available for free under the GNU GPL version 2 with commercial support and the opportunity to sponsor our work, grsecurity brings you the security of the next decade, today.

Boxen

Boxen is your team’s IT robot. It’s a dangerously opinionated framework that automates every piece of your development environment. GitHub, Inc. wrote the first version of Boxen (imaginatively called “The Setup”) to help employees start shipping on day one. It’s configuration management for everyone: Designers, HR mavens, legal eagles, and developers. We believe that development is production, so we value consistency, predictability, and reproducibility over artisanal, hand-tweaked development environments.

We ditched The Setup and wrote Boxen so it’s easily usable by any company, not just GitHub. We’ve extracted most Boxen features into modules that can be mixed and matched to create your perfect environment, and custom behavior is always just a module away.

Astyanax

Astyanax is a Java Cassandra client library. Astyanax was the son of Hector in Greek mythology. As such, Astyanax is a refactoring of Hector into a cleaner abstraction for the connection manager and a simpler API.

Astyanax provides a complete abstraction of the connection pool implementation from the API layer. Key features include:

  1. Automatic failover with context
  2. Pinning a request to a specific host
  3. Host partitions based on token ranges
  4. Pluggable latency tracking strategy
  5. Pluggable host selection (e.g. round robin, lowest latency first)
  6. Pluggable bad host detector to determine when to mark a host as down (e.g. if it times out too frequently)
  7. Pluggable monitor interface. There is no logging inside the connection pool.
  8. Pluggable host retry backoff strategy
  9. Pluggable node discovery strategy. Can use ring_describe or a custom node registry service.
  10. Minimal use of synchronized by using non-blocking data structures

Provided implementations

  1. Basic round robin
  2. Token aware
  3. Bag of connections
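As a rough illustration of how round-robin host selection and a bad host detector can fit together, here is a small Python sketch. It is not Astyanax code and does not use its Java API. The class `RoundRobinPool` and the parameters `max_failures`, `retry_after`, and `clock` are hypothetical names chosen for this example.

```python
import time


class RoundRobinPool:
    """Hypothetical round-robin host picker with a simple bad-host detector."""

    def __init__(self, hosts, max_failures=3, retry_after=5.0, clock=time.monotonic):
        self.hosts = list(hosts)
        self.max_failures = max_failures  # failures before a host is marked down
        self.retry_after = retry_after    # seconds a host stays marked down
        self.clock = clock                # injectable so the sketch is easy to test
        self.failures = {host: 0 for host in self.hosts}
        self.down_until = {}              # host -> time at which it may be retried
        self._next = 0

    def pick(self):
        """Return the next host in rotation, skipping hosts that are marked down."""
        now = self.clock()
        for _ in range(len(self.hosts)):
            host = self.hosts[self._next]
            self._next = (self._next + 1) % len(self.hosts)
            if host not in self.down_until or self.down_until[host] <= now:
                return host
        raise RuntimeError("no hosts available")

    def report_failure(self, host):
        """Count a failure and mark the host down once the threshold is reached."""
        self.failures[host] += 1
        if self.failures[host] >= self.max_failures:
            self.down_until[host] = self.clock() + self.retry_after
            self.failures[host] = 0

    def report_success(self, host):
        """Reset the failure count after a successful request."""
        self.failures[host] = 0
```

A fixed `retry_after` stands in for the pluggable retry backoff strategy here. A fuller version might grow the delay each time the same host is marked down again.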

Bamboo

bamboo is an application that systematizes realtime data analysis. bamboo provides an interface for merging, aggregating and adding algebraic calculations to dynamic datasets. Clients can interact with bamboo through a REST web interface and through Python.

bamboo supports a simple querying language to build calculations (e.g. student teacher ratio) and aggregations (e.g. average number of students per district) from datasets. These are updated as new data is received.
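The two examples above can be sketched directly in pandas, which bamboo uses underneath. This does not use bamboo's own formula language or REST interface. The column names (`district`, `students`, `teachers`) and the function `summarize` are assumptions made for illustration.

```python
import pandas as pd


def summarize(schools):
    """Return (per-row student/teacher ratio, mean students per district)."""
    # Calculation: a new value derived row by row from existing columns.
    ratio = schools["students"] / schools["teachers"]
    # Aggregation: one summary value per group.
    avg_students = schools.groupby("district")["students"].mean()
    return ratio, avg_students


schools = pd.DataFrame(
    {
        "district": ["north", "north", "south"],
        "students": [300, 500, 450],
        "teachers": [15, 20, 18],
    }
)
ratio, avg_students = summarize(schools)
```

Calling `summarize` again after appending new rows would recompute both results, which loosely mirrors how the calculations are kept up to date as data arrives.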

bamboo uses pandas for data analysis, pyparsing to read formulas, and mongodb to serialize data.

bamboo is open source software released under the 3-clause BSD license, which is also known as the “Modified BSD License”.

Suro

Suro is a data pipeline service for collecting, aggregating, and dispatching large volumes of application events, including log data. It has the following features:

  • It is distributed and can be horizontally scaled.
  • It supports streaming data flow, a large number of connections, and high throughput.
  • It allows dynamically dispatching events to different locations with flexible dispatching rules.
  • It has a simple and flexible architecture to allow users to add additional data destinations.
  • It fits well into the NetflixOSS ecosystem.
  • It is a best-effort data pipeline with support for flexible retries and store-and-forward to minimize message loss.
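The combination of rule-based dispatching, retries, and store-and-forward can be sketched in a few lines of Python. This is a toy, in-process illustration and not Suro's actual API or configuration format. The class `Router`, its methods, and the `max_retries` parameter are hypothetical names.

```python
from collections import deque


class Router:
    """Toy event router: rule-based dispatch, retries, store-and-forward."""

    def __init__(self, max_retries=2):
        self.routes = []          # list of (predicate, sink) pairs
        self.max_retries = max_retries
        self.pending = deque()    # events held back after all retries failed

    def add_route(self, predicate, sink):
        """Send every event for which predicate(event) is true to sink."""
        self.routes.append((predicate, sink))

    def dispatch(self, event):
        """Deliver the event to every sink whose rule matches it."""
        for predicate, sink in self.routes:
            if predicate(event):
                self._deliver(event, sink)

    def _deliver(self, event, sink):
        # Try once plus max_retries more times; any exception counts as a failure.
        for _ in range(self.max_retries + 1):
            try:
                sink(event)
                return True
            except Exception:
                continue
        # Store the event so it can be forwarded later instead of being dropped.
        self.pending.append((event, sink))
        return False

    def flush(self):
        """Retry each stored event once; events that still fail are stored again."""
        for _ in range(len(self.pending)):
            event, sink = self.pending.popleft()
            self._deliver(event, sink)
```

Delivery stays best-effort in this sketch: an event that keeps failing remains in `pending`, and nothing here persists that queue across restarts.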

Harvest, clinically mine data.

MADlib

MADlib is an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine-learning methods for structured and unstructured data.

The MADlib mission: to foster widespread development of scalable analytic skills, by harnessing efforts from commercial practice, academic research, and open-source development.

OpenRefine

OpenRefine is a power tool that allows you to load data, understand it, clean it up, reconcile it against a master database, and augment it with data coming from Freebase or other web sources, all from the comfort and privacy of your own computer.

RocksDB

RocksDB is an embeddable persistent key-value store for fast storage. RocksDB can also be the foundation for a client-server database, but our current focus is on embedded workloads.

RocksDB builds on LevelDB to scale on servers with many CPU cores, to use fast storage efficiently, to support IO-bound, in-memory, and write-once workloads, and to be flexible enough to allow for innovation.