This was the question that inspired Corterix CTO Paul Krneta, the former CTO of Sybase IQ and one of the world’s true experts on columnar databases, to develop Corterix.
Paul’s goal was to design and implement software that would ingest, index and load unstructured and structured data from multiple sources simultaneously into a single SQL database. All data loaded into the database could then be queried using SQL and through industry-standard interfaces.
Hadoop was introduced in 2006. It was originally an unstructured data search capability, but during the past decade, it has morphed into a do-all for big data.
Hadoop is complex. It’s touted as free but is very expensive in practice: it requires new infrastructure, new languages and tools, people with new skills, and consultants.
Given the unique needs for speed, massive scale, and security, Paul decided to embark on a different path. Although it would require a tremendous engineering effort, he decided to create a simple, minimally disruptive, unified data analytics solution on a SQL columnar database. His design took shape in 2000 and development began in 2009.
Paul decided to add unstructured data ingestion, indexing, loading and query to the structured equivalents in the database. Done correctly, a unified data analytics platform would enable business analysts, managers and executives to monitor business processes in new ways and find new business opportunities.
To accomplish the goal, the development team at Corterix created four new data models: documents, email/messages, multimedia, and sensors. Significant effort then went into the optimization of queries across all of the unstructured data models and structured data models.
Next came the design and implementation of the multi-channel, non-blocking ETL required for real-time, high-performance ingestion. The result is a 4 million channel ETL that was certified at 34.5 TB per hour in 2013. Data was ingested from thirty billion different sources. At the same time, certification was granted for the largest unified data warehouse – 12 petabytes.
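To put the certified figures in perspective, the short calculation below converts them into per-second and per-channel rates. It is a back-of-the-envelope sketch only: it assumes decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes) and an even spread of load across all channels, neither of which is stated in the certification figures, and the function names are illustrative.

```python
# Back-of-the-envelope arithmetic on the quoted figures.
# Assumptions (not from the certification itself): decimal units and
# load spread evenly across every channel.
TB = 10**12
PB = 10**15


def ingest_rates(tb_per_hour, channels):
    """Return (total bytes/second, bytes/second per channel)."""
    bytes_per_second = tb_per_hour * TB / 3600
    return bytes_per_second, bytes_per_second / channels


def hours_to_fill(capacity_pb, tb_per_hour):
    """Hours of sustained ingestion needed to load capacity_pb petabytes."""
    return capacity_pb * PB / (tb_per_hour * TB)


total, per_channel = ingest_rates(34.5, 4_000_000)
print(f"Aggregate rate:   {total / 1e9:.2f} GB/s")
print(f"Per-channel rate: {per_channel / 1e3:.2f} KB/s")
print(f"Time to load 12 PB: {hours_to_fill(12, 34.5) / 24:.1f} days")
```

Under these assumptions, 34.5 TB per hour works out to roughly 9.6 GB per second in aggregate, and loading 12 PB at that sustained rate would take about two weeks.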
Now it was time to optimize simultaneous ETL and query. It was important that unified queries across petabytes of data return results in minimum time, and do so concurrently with significant data ingestion over multiple channels, including transformation, indexing and loading.
The resulting CORTERIX Unified Data Analytics Solution achieved all of Paul’s design goals. CORTERIX sits on an x86 shared storage infrastructure. Servers and storage can be different sizes and different generations from different vendors.
CORTERIX enables the ingestion, transformation, indexing and loading of unstructured and structured data over up to 4 million channels into a single SQL database. All data loaded into the database can be queried using SQL, R and via industry standard interfaces immediately after data is committed by the SQL database.
Since CORTERIX enables a true data lake, all queries execute against all loaded data. No intermediate tables are required. No programming or integration is required. CORTERIX can also address embedded analytic use cases, such as intelligent archival.
The impact of CORTERIX is significant. CORTERIX puts analytics in the hands of experienced business analysts, managers and executives. It capitalizes on existing skills and personnel. Current query, report writing and visualization tools can be used. New tools that speak SQL or JDBC can be used immediately with no integration effort.
Administration is easy for SQL database administrators. Data access is controlled using ACLs. CORTERIX permits asymmetric upgrade of the underlying infrastructure. Servers can be added independently of storage to increase ETL and query performance without system interruption.
It’s a complete solution that delivers big results with no disruption – no new skills, languages, infrastructure, employees or administrators. So it’s not a me-too solution. Corterix developers started from a different place. Our goal was to deliver a big data analytics solution that is a scalable, secure, single database repository, immediately usable by all to analyze both data at rest and streaming data.
Its design and implementation took two years. Once the design was right, functionality was achieved. Then the team focused on performance, scalability and availability.
The team sought out high-end big data challenges and PoCs to test functionality, performance and scalability thoroughly.
The results made it clear that users of Corterix can scale the repository to more than 12 PB and scale ingestion, indexing and loading of data into the single repository to more than 34 TB/hour.
What does Corterix mean to our prospects and customers? It’s an efficient solution that enables current and future big data analytic efforts with no disruption to your business, employees and technologies. Corterix enables all of the following:
Next generation inline, real-time big data queries. From a single database repository that is simultaneously indexing and loading vast amounts of unstructured and structured data.
Unmatched functionality. It enables batch and streaming ingestion, indexing and loading into a single database repository of just about any unstructured and structured data for immediate query, using intrinsic, not bolt-on, SQL, R and industry-standard APIs and interfaces.
Performance. Corterix scales performance asymmetrically to meet ETL and Query demands.
Scalability. With its asymmetric infrastructure scaling, you can add servers independently from storage. Or add storage independently from servers.
Beautiful Simplicity. Corterix can be introduced easily into existing and new big data analytic environments. All data ingested can be cross-correlated using SQL.
You can take advantage of tools that all your users are comfortable with to analyze and visualize all data in the repository.
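As a rough illustration of what cross-correlating structured and unstructured data in plain SQL can look like, the sketch below joins order records against email text. Everything in it is an assumption made for illustration: SQLite stands in for the database, and the `orders` and `emails` tables, their columns and the sample rows are hypothetical, not the actual Corterix schema or data models.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with one structured and one text table.

    Table and column names are hypothetical, chosen only for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
        CREATE TABLE emails (email_id INTEGER PRIMARY KEY, sender TEXT, body TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "alice", 120.0), (2, "bob", 75.5), (3, "alice", 30.0)],
    )
    conn.executemany(
        "INSERT INTO emails VALUES (?, ?, ?)",
        [
            (1, "alice", "The shipment arrived damaged"),
            (2, "carol", "Please send a refund"),
            (3, "bob", "Thanks, great service"),
        ],
    )
    return conn


def spend_of_customers_mentioning(conn, keyword):
    """Total order value per customer whose emails contain the keyword."""
    query = """
        SELECT customer, SUM(amount) AS total_spend
        FROM orders
        WHERE customer IN (SELECT sender FROM emails WHERE body LIKE ?)
        GROUP BY customer
        ORDER BY customer
    """
    return conn.execute(query, (f"%{keyword}%",)).fetchall()


conn = build_demo_db()
for customer, total_spend in spend_of_customers_mentioning(conn, "damaged"):
    print(customer, total_spend)
```

The point of the sketch is that the text condition and the numeric aggregate sit in one query against one repository, so an existing SQL-based reporting tool could issue it without a separate search system.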
Corterix eliminates the need for two-repository, federated-query investments. The benefits are immediate. Challenge us to demonstrate the breakthrough benefits of Corterix. Bring us your top-priority big data analytics challenges, and prepare to watch something you’ve never seen: the challenge met with beautiful simplicity.
Perry J. Narancic brings to Corterix experience from leadership roles in the software industry, especially in content management, archiving and electronic discovery. He is listed as an inventor on a number of pending US patent applications and is a founder of LexFusion Software, Inc. and a co-founder of AgroThrive, Inc. He has headed departments at iManage (NASDAQ: IMAN) and Verity, Inc. (NASDAQ: VRTY) and currently serves as outside general counsel to a number of privately held companies. He has negotiated multi-million dollar licensing transactions with the likes of Oracle, Motorola and other large enterprises. Perry worked at Wilson Sonsini Goodrich and Rosati in Palo Alto, CA. He applies his own start-up experience to assist Corterix, particularly in licensing and intellectual property matters.
Paul is a recognized innovator in columnar database technology with over 25 years of experience in data analytics.
He served as CTO for Sybase IQ, where he architected the Multiplex option for IQ and optimized IQ for VLDB. He successfully certified the “World’s Largest Data Warehouse” in 2002 and 2005 in partnership with Sun Microsystems.
Paul also designed and certified a NonStopIQ option for Sybase IQ in partnership with EMC, Hitachi and Sun. The option gives large data warehouse installations near-instant recovery and a disaster recovery (DR) site.
During his tenure at Digital Equipment Corp (“DEC”) he served as the technical director for database technology and optimized Oracle to become the first 64-bit database capable of VLM (“Very Large Memory”). He also managed the first 1 TB/hour online backup in 1996.
He led the design and implementation of the first data warehouse for structured and unstructured data at BMMsoft – a certified 12-petabyte data warehouse and the largest data warehouse ever designed. As a bonus, it reduces storage usage by 90%, qualifying for the title “green data warehouse.”
Paul holds an Electronics Engineering degree and a Master’s degree in Computer Science.
Jim brings industry leadership to Corterix sales with over 30 years of experience in strategic business development and consulting, including sales leadership, software development, channel creation and development for global system integrators, and successful startups.
He has been a key executive stakeholder charged with sales and delivery management of revenue acceleration services and relationship development of global systems integrators and outsourcers. Jim leverages his long-established network to provide strategic growth consulting engagements to storage, software, and services companies.
Sales Principal, Industry Verticals, at VCE
Managing Director of Optimal Strategies, an international management consulting firm
EMC’s Lifetime Achievement Award and Individual Innovator Award
Computer Science & Electrical Engineering | Syracuse University | Completed in 3 Years