Here are 7,532 public repositories matching the topic "spark":
Repository Created on March 3, 2014, 4:08 pm
h2o machine-learning data-science deep-learning big-data ensemble-learning gbm random-forest naive-bayes pca
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Last updated on October 2, 2023, 12:55 pm
Repository Created on September 24, 2017, 7:36 pm
nlp natural-language-processing spark spark-ml pyspark named-entity-recognition sentiment-analysis lemmatizer spell-checker entity-extraction
State of the Art Natural Language Processing
Last updated on October 2, 2023, 1:56 pm
Repository Created on March 3, 2016, 4:01 pm
Neo4j Connector for Apache Spark, which provides bi-directional read/write access to Neo4j from Spark, using the Spark DataSource APIs
Last updated on September 29, 2023, 3:40 pm
Repository Created on May 16, 2022, 10:11 pm
machine-learning artificial-intelligence data data-engineering data-science python elt etl pipelines data-pipelines
🧙 The modern replacement for Airflow. Build, run, and manage data pipelines for integrating and transforming data.
Last updated on October 2, 2023, 10:16 am
Repository Created on April 22, 2019, 6:56 pm
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Last updated on October 2, 2023, 2:49 pm