DataPelago Unveils World’s First Universal Data Processing Engine

Backed by $47 Million in Funding, the Company’s Engine Accelerates Data Processing for GenAI and Lakehouse Analytics

Oct 01, 2024

DataPelago today unveiled a revolutionary Universal Data Processing Engine to accelerate any engine, including open source, on any hardware, using any data type. DataPelago’s engine enables organizations to extract value from data at unprecedented price and performance for their GenAI and analytics workloads. The company is launching from stealth with $47 million in funding from Eclipse, Taiwania Capital, Qualcomm Ventures, Alter Venture Partners, Nautilus Venture Partners, and Silicon Valley Bank, a division of First Citizens Bank.


Traditional processing solutions based on CPUs and today’s software architectures cannot keep up with the complexity and volume of data, which is doubling every two years, with unstructured data now accounting for 90% of all data created. The surge of GenAI and its dependence on huge volumes of unstructured data are compounding the processing challenge. DataPelago is creating a new data processing standard for the accelerated computing era to overcome these performance, cost, and scalability limitations.


“Today, organizations are faced with an insurmountable barrier to unlocking breakthrough intelligence and innovation: processing an endless sea of data,” said DataPelago co-founder and CEO, Rajan Goyal. “We created DataPelago to address this critical need. By applying nonlinear thinking to overcome data processing’s current limits, we’ve built an engine capable of processing exponentially increasing volumes of complex data across varied formats, making it possible for organizations to truly realize the value of their data.” 


DataPelago’s Universal Data Processing Engine is available as an end-to-end solution or integrated with Substrait-based open source frameworks to turbocharge Spark and Trino with accelerated computing. It provides customers with disruptive price/performance advantages without any changes to applications or workflows. DataPelago integrates into existing data stores and lakehouse platforms, eliminating the need for data migration and avoiding vendor lock-in.


“The exponential growth of semi-structured and unstructured data along with rapid Gen AI/AI adoption is driving innovation, not only in AI, but in data management and data processing,” said Steve Grobman, Executive VP and CTO, McAfee, a DataPelago design partner. “McAfee has been proud to partner with DataPelago on the design of their technology that shows promising results, including significant performance and cost improvements on certain workloads.”


The DataPelago engine has an innovative architecture composed of three layers that together process data one to two orders of magnitude faster than today’s query engines.


  DataVM - the industry’s first virtual machine with a domain-specific Instruction Set Architecture (ISA) for data operators providing a common abstraction for execution on accelerated computing elements, spanning CPU, GPU, FPGA, and custom silicon.


  DataOS - the operating system layer that maps data operations to heterogeneous accelerated computing elements and manages them dynamically to optimize performance at scale.


  DataApp - a pluggable layer that enables integration with platforms including Spark and Trino to deliver acceleration capabilities to these engines.
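
To make the layering concrete, the following is a minimal, purely illustrative Python sketch of the general idea: a small set of data operators acts as a common abstraction, and a scheduler maps each operator to whichever compute element supports it at the lowest cost. It is not DataPelago's API or implementation. Every name here (`Op`, `Backend`, `pick_backend`, `run_plan`) and every cost value is a hypothetical assumption, and the "gpu" backend is only a label on ordinary Python functions.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, Dict, List, Sequence, Tuple


class Op(Enum):
    """Hypothetical data operators standing in for a domain-specific instruction set."""
    FILTER_POSITIVE = auto()
    DOUBLE = auto()
    SUM = auto()


Kernel = Callable[[List[float]], List[float]]


@dataclass
class Backend:
    """A toy compute element: a name, a made-up cost per operator, and its kernels."""
    name: str
    cost: Dict[Op, float]
    kernels: Dict[Op, Kernel]

    def supports(self, op: Op) -> bool:
        return op in self.kernels


def reference_kernels() -> Dict[Op, Kernel]:
    """Plain-Python kernels; a real accelerator would supply its own."""
    return {
        Op.FILTER_POSITIVE: lambda xs: [x for x in xs if x > 0],
        Op.DOUBLE: lambda xs: [2 * x for x in xs],
        Op.SUM: lambda xs: [sum(xs)],
    }


def pick_backend(op: Op, backends: Sequence[Backend]) -> Backend:
    """Choose the cheapest backend that supports the operator."""
    candidates = [b for b in backends if b.supports(op)]
    if not candidates:
        raise ValueError(f"no backend supports {op.name}")
    return min(candidates, key=lambda b: b.cost[op])


def run_plan(
    plan: Sequence[Op], data: List[float], backends: Sequence[Backend]
) -> Tuple[List[float], List[Tuple[str, str]]]:
    """Execute operators in order, recording which backend ran each one."""
    placement = []
    for op in plan:
        backend = pick_backend(op, backends)
        data = backend.kernels[op](data)
        placement.append((op.name, backend.name))
    return data, placement


# Hypothetical backends: "cpu" supports everything, "gpu" only two operators.
_k = reference_kernels()
cpu = Backend("cpu", {op: 1.0 for op in Op}, _k)
gpu = Backend(
    "gpu",
    {Op.FILTER_POSITIVE: 0.2, Op.DOUBLE: 0.2},
    {Op.FILTER_POSITIVE: _k[Op.FILTER_POSITIVE], Op.DOUBLE: _k[Op.DOUBLE]},
)

result, placement = run_plan(
    [Op.FILTER_POSITIVE, Op.DOUBLE, Op.SUM], [3.0, -1.0, 2.0], [cpu, gpu]
)
```

In this toy run, the filter and doubling steps are placed on the cheaper "gpu" backend and the final sum falls back to "cpu", the only backend that supports it. The calling code never names a backend, which is the property the pluggable top layer is described as preserving for engines such as Spark and Trino.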


“Partnering with DataPelago exemplifies our dedication to innovating for exceptional customer service,” said André Fichel, CTO at Akad Seguros, an early DataPelago customer. “DataPelago's engine allows us to unify our GenAI and data analytics pipelines by processing structured, semi-structured, and unstructured data on the same pipeline while reducing our costs by more than 50%.” 


DataPelago’s engine is uniquely suited for use cases that are resource intensive, such as analyzing billions of transactions while ensuring data freshness, supporting AI-driven models to detect threats at wire-line speeds across millions of consumer and data center endpoints, and providing a scalable platform to facilitate the rapid deployment of training, fine-tuning and RAG inference pipelines. 


Co-founder and CEO Rajan Goyal has over 20 years of experience building accelerated computing solutions across domains such as security, data movement, and data storage. With DataPelago, Goyal has assembled a multi-disciplinary team with decades of experience across system, architecture, data analytics, cloud SaaS, open source development, and more to shatter the limits that data processing faces today in performance, cost, and scalability. 


“When data can be extracted as quickly as it’s generated, businesses can harness insights to make better decisions and operate more efficiently,” said Lior Susan, CEO and Founding Partner at Eclipse and a DataPelago board member. “DataPelago’s universal data processing engine represents a paradigm shift that will unlock new possibilities in the worlds of supply chain, sustainable energy, the medical field, and beyond.” 


“DataPelago’s foresight to cleverly architect their engine to be processing unit agnostic, including GPUs, positions them as an undisputed leader in data acceleration,” said Cheng Wu, General Partner of Tech fund at Taiwania Capital and a DataPelago board member. “DataPelago has a visionary founder, a top-notch team, and a track record of proven results to support their claims at every stage of their journey in the new Data + AI world.”