Here are 3,679 public repositories matching the topic "big-data"
Repository Created on January 28, 2022, 12:58 pm
The streaming database: redefining stream processing 🌊. PostgreSQL-compatible, highly performant, scalable, elastic, and reliable ☁️.
Last updated on December 4, 2023, 4:39 am
Repository Created on December 5, 2022, 4:32 pm
YTsaurus is a scalable and fault-tolerant open-source big data platform.
Last updated on December 2, 2023, 10:01 pm
Repository Created on September 4, 2021, 2:29 am
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. Winner of InfoWorld’s 2023 BOSSIE Award for best open source software.
Last updated on December 4, 2023, 5:37 am
Repository Created on June 5, 2023, 2:52 am
TuGraph Analytics is an OLAP graph database.
Last updated on December 3, 2023, 2:26 am
Repository Created on February 25, 2014, 8:00 am
Apache Spark - A unified analytics engine for large-scale data processing
Last updated on December 4, 2023, 6:48 am
Repository Created on June 7, 2014, 7:00 am
Apache Flink
Last updated on December 4, 2023, 6:19 am
Repository Created on January 12, 2022, 3:13 am
Apache Paimon (incubating) is a streaming data lake platform that supports high-speed data ingestion, change data tracking, and efficient real-time analytics.
Last updated on December 4, 2023, 5:38 am
Repository Created on June 10, 2014, 7:00 am
Apache Parquet
Last updated on December 3, 2023, 11:43 pm
Repository Created on June 2, 2016, 8:28 am
ClickHouse® is a free analytics DBMS for big data
Last updated on December 4, 2023, 6:12 am
Repository Created on May 6, 2015, 7:00 am
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
Last updated on December 3, 2023, 4:26 am
Repository Created on May 8, 2018, 9:22 pm
VerticaPy is a Python library that exposes scikit-like functionality for conducting data science projects on data stored in Vertica, taking advantage of Vertica’s speed and built-in analytics and machine learning capabilities.
Last updated on November 30, 2023, 8:24 am
Repository Created on April 13, 2021, 10:45 pm
Sub-second search & analytics engine on cloud storage
Last updated on December 4, 2023, 5:09 am
Repository Created on July 18, 2018, 3:30 am
A graph database that supports 100+ billion data, with high performance and scalability (includes OLTP engine, REST API, and backends)
Last updated on December 3, 2023, 3:38 pm
Repository Created on May 21, 2009, 2:03 am
Seamless multi-master syncing database with an intuitive HTTP/JSON API, designed for reliability
Last updated on December 4, 2023, 5:40 am
Repository Created on November 22, 2023, 11:36 pm
This application predicts whether a past event was a success or a failure
Last updated on November 23, 2023, 10:04 pm
Repository Created on June 16, 2020, 8:59 am
swissgeol.ch gives you insight into geoscientific data, above and below the surface.
Last updated on December 2, 2023, 3:39 am
Repository Created on May 17, 2015, 7:00 am
Apache Flink Website
Last updated on November 28, 2023, 7:48 pm
Repository Created on July 11, 2020, 10:57 pm
Fluid, elastic data abstraction and acceleration for BigData/AI applications in the cloud. (Project under CNCF)
Last updated on December 3, 2023, 2:43 pm
Repository Created on April 6, 2011, 7:00 am
Apache BookKeeper - a scalable, fault-tolerant, and low-latency storage service optimized for append-only workloads
Last updated on December 4, 2023, 12:49 am
Repository Created on November 24, 2023, 12:23 am
One more repository with basic concepts, technical challenges, and resources on data engineering in Spanish 🧙✨
Last updated on December 4, 2023, 6:05 am
Repository Created on February 14, 2019, 9:21 pm
Apache Calcite Website
Last updated on July 25, 2023, 2:23 pm
Repository Created on August 8, 2017, 7:00 am
Mirror of Apache Calcite - Avatica Go SQL Driver
Last updated on July 25, 2023, 2:10 pm
Repository Created on June 28, 2021, 7:29 am
A blazing-fast query execution engine that speaks the Apache Spark language and has Arrow-DataFusion at its core.
Last updated on December 2, 2023, 10:53 am
Repository Created on February 19, 2023, 11:14 pm
MalwareDB: bookkeeping for malware, goodware, and unknown files with relationship discovery
Last updated on October 29, 2023, 12:16 am
Repository Created on November 24, 2018, 9:29 pm
Apache IoTDB
Last updated on December 4, 2023, 5:23 am
Repository Created on September 11, 2023, 4:13 am
DS5110 Big Data Systems
Last updated on September 13, 2023, 4:03 pm
Repository Created on December 5, 2018, 1:01 am
A distributed block-based data storage and compute engine
Last updated on November 28, 2023, 7:02 pm
Repository Created on April 22, 2019, 6:56 pm
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Last updated on December 4, 2023, 3:03 am
Repository Created on May 14, 2019, 7:32 am
Java library that sorts very large files of records by splitting into smaller sorted files and merging
Last updated on September 8, 2023, 5:53 pm
Repository Created on April 17, 2022, 4:27 am
🚄 FASTJSON2 is a Java JSON library with excellent performance.
Last updated on December 4, 2023, 3:54 am