• Apache Hadoop (/həˈduːp/) is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving...
    49 KB (5,094 words) - 23:30, 26 April 2024
  • Apache ZooKeeper
    Apache Hadoop, Apache Accumulo, Apache HBase, Apache Hive, Apache Kafka, Apache Drill, Apache Solr, Apache Spark, Apache NiFi, Apache Druid, Apache Helix, Apache Pinot...
    8 KB (714 words) - 15:45, 24 October 2023
  • Apache Hive
    Apache Hive is a data warehouse software project. It is built on top of Apache Hadoop for providing data query and analysis. Hive gives an SQL-like interface...
    21 KB (2,302 words) - 03:17, 12 May 2024
  • Apache Parquet is a free and open-source column-oriented data storage format in the Apache Hadoop ecosystem. It is similar to RCFile and ORC, the other...
    9 KB (740 words) - 21:39, 3 January 2024
  • Apache Avro
    remote procedure call and data serialization framework developed within Apache's Hadoop project. It uses JSON for defining data types and protocols, and serializes...
    13 KB (1,326 words) - 18:53, 24 April 2024
  • Java. It is developed as part of Apache Software Foundation's Apache Hadoop project and runs on top of HDFS (Hadoop Distributed File System) or Alluxio...
    10 KB (818 words) - 02:06, 12 April 2024
  • Apache Impala is an open source massively parallel processing (MPP) SQL query engine for data stored in a computer cluster running Apache Hadoop. Impala...
    7 KB (577 words) - 03:15, 17 October 2022
  • platforms such as Apache Spark; Beam: an uber-API for big data; Bigtop: a project for the development of packaging and tests of the Apache Hadoop ecosystem; Bloodhound:...
    41 KB (4,615 words) - 23:38, 10 May 2024
  • Apache Spark
    testing), Hadoop YARN, Apache Mesos or Kubernetes. For distributed storage, Spark can interface with a wide variety, including Alluxio, Hadoop Distributed...
    30 KB (2,732 words) - 02:20, 12 April 2024
  • Apache Pig is a high-level platform for creating programs that run on Apache Hadoop. The language for this platform is called Pig Latin. Pig can execute...
    11 KB (979 words) - 18:51, 15 July 2022