Tag: bigdata
-
HadoopOffice – A Vision for the coming Years
HadoopOffice has been available for more than a year (first commit: 16.10.2016). Currently, it supports Excel formats based on the Apache POI parsers/writers. Meanwhile, a lot of functionality has been added, such as: support for .xlsx and .xls formats (reading and writing); encryption/decryption support; support for the Hadoop mapred.* and mapreduce.* APIs; support for Spark…
-
Big Data Analytics on Bitcoin's first Altcoin: NameCoin
This blog post is about analyzing the Namecoin Blockchain using different Big Data technologies based on the HadoopCryptoLedger library. Currently, this library enables you to analyze the Bitcoin blockchain and Altcoins based on Bitcoin (incl. segregated witness), such as Namecoin, Litecoin, Zcash etc., on Big Data platforms, such as Hadoop, Hive, Flink and Spark. A…
-
Spending Time on Quality in Your Big Data Open Source Projects
Open source libraries are nowadays a critical part of the economy. They are used in commercial and non-commercial applications directly or indirectly affecting virtually any human being. Ensuring quality should be at the heart of each open source project. Verifying that an open source project ensures quality is mandatory for each stakeholder of this project,…
-
Leverage the Power of Apache Flink to analyze the Bitcoin Blockchain
The hadoopcryptoledger library has been enhanced with a datasource for Apache Flink. This means you can use the Big Data processing framework Apache Flink to analyze the Bitcoin blockchain. It also includes an example that counts the total number of transactions in the Bitcoin blockchain. Of course, given the power of Apache Flink you can think…
-
Big Data Lab in the Cloud with Hadoop+Spark+R+Python
This is an update of the second big data lab for the cloud. Similar to previous versions, this document describes how you can create a Big Data Lab in the cloud on Amazon EMR. Besides some major upgrades to the newest Amazon Hadoop AMI (3.6.0), Spark (1.3.0) and R, it now also includes the possibility…
-
Example projects for using various NoSQL and Big Data technologies
Recently, I published several example Java projects on github.com for using various NoSQL technologies: cassandra-tutorial: Apache Cassandra tutorial (column-oriented database); mongodb-tutorial: MongoDB tutorial (document database); neo4j-tutorial: Neo4j (graph database); redis-tutorial: Redis (key/value store); solr-tutorial: Apache SolrCloud (search technology). Other example Java projects aim at standardized big data processing platforms:…