The Question of Maintenance of pre-trained Machine Learning Embeddings

In this post I address the maintenance of large pretrained embeddings within Artificial Intelligence (AI) services. While this issue has some links to ethical aspects (see, for example, the European Commission’s guidelines on trustworthy AI), the focus here is on the maintainability of those embeddings as part of MLOps. Software Maintenance…

Secure Blockchain Analytics

Blockchain analytics has become a trending topic in recent years. It is of interest not only for public blockchains, such as Bitcoin or Ethereum and their Altcoins, but also for private/permissioned blockchains based on various technologies. Nevertheless, there are many challenges involved, such as the large data volumes, the inefficient format for analytics, state…

AI Applications and Systems for Deep Logic and Probabilistic Networks

This blog post describes the integration of deep learning, logic and probabilistic reasoning to enable advanced artificial intelligence tasks. The combination of completely different sets of AI approaches will be one of the key advances to support AI-driven business processes in the coming years. Furthermore, I describe challenges for operating such complex AI systems…

Unikernels, Software Containers and Serverless Architecture: Road to Modularity

This blog post discusses the implications of Unikernels, Software Containers and Serverless Architecture for the modularity of complex software systems in a service mesh, as illustrated below. Modular software systems claim to be more maintainable, secure and future-proof compared to software monoliths. Software containers or the alternative MicroVMs have proven very successful…

GPUs, FPGAs, TPUs for Accelerating Intelligent Applications

Intelligent Applications are part of our everyday life. One observes a constant flow of new algorithms, models and machine learning applications. Some require ingesting a lot of data, some require applying a lot of compute resources and some address real-time learning. Dedicated hardware capabilities can thus support some of those, but not all. Many…

Collaborative Data Science: About Storing, Reusing, Composing and Deploying Machine Learning Models

Why is this important? Machine Learning has re-emerged in recent years as new Big Data platforms provide the means to use models with more data, make them more complex, and combine several models for an even more intelligent predictive/prescriptive analysis. This requires storing as well as exchanging machine learning models to enable…

Automated Machine Learning (AutoML) and Big Data Platforms

Although machine learning has existed for decades, the typical data scientist, as the role is called today, still has to go through a manual, labor-intensive process of extracting the data, cleaning, feature extraction, regularization, training, finding the right model, testing, selecting and deploying it. Furthermore, for most machine learning scenarios you do…

Big Data Analytics on Excel files using Hadoop/Hive/Flink/Spark

Today we have released HadoopOffice v1.1.0 with major enhancements: it is based on the latest Apache POI 3.17; Apache Hive: query Excel files and write tables to Excel files using the Hive SerDe; Apache Flink support for the Flink Table API and Flink DataSource/DataSink; signing and verification of signatures of Excel files; example to use the HadoopOffice library…

HadoopOffice – A Vision for the coming Years

HadoopOffice has been available for more than a year (first commit: 16.10.2016). Currently it supports Excel formats based on the Apache POI parsers/writers. Meanwhile, a lot of functionality has been added, such as: support for .xlsx and .xls formats (reading and writing); encryption/decryption support; support for the Hadoop mapred.* and mapreduce.* APIs; support for Spark…

HadoopCryptoLedger Library – A Vision for the coming Years

The first commit of the HadoopCryptoLedger library was made on 26 March 2016. Since then, a lot of new functionality has been added, such as support for major Big Data platforms including Hive / Flink / Spark. Furthermore, besides Bitcoin, Altcoins based on Bitcoin (e.g. Namecoin, Litecoin or Bitcoin Cash) and Ethereum (including Altcoins) have…