Elasticsearch – introduction to key concepts


    The ambition behind this article

    During our work in NeuroSYS, we’ve dealt with a variety of problems in Natural Language Processing, including Information Retrieval. We have mainly focused on deep learning models based on Transformers. However, Elasticsearch has often served us as a great baseline. We have been using this search engine extensively; thus, we would like to share our findings with you.

    But why should you read this if you can go straight to the Elasticsearch documentation? Don’t get us wrong, the documentation is an excellent source of information that we rely on every day. However, as documentation does, it has to be thorough and cover every bit of what the tool has to offer.

    Instead, we will focus more on NLP and practical aspects of Elasticsearch. We’ve also decided to split this article into two parts:

    1. Introductory part
      1. explaining main concepts,
      2. pointing out what we consider to be the most important,
      3. identifying the non-obvious things that might lead to errors or improper usage,
    2. Experimental part
      1. providing ready-to-use code,
      2. proposing some tips on optimization,
      3. presenting the results of using different strategies.

    Even if you are more interested in the latter, we still strongly encourage you to read the introduction.

    In the following five steps, we reveal what we find most important to start experimenting with improving the quality of your search results.

    Step 1: Understand what Elasticsearch is, and what a search engine is

    Elasticsearch is a search engine, used by millions for finding query results in no time. Elastic has many applications; however, we will mainly focus on the aspect most crucial for us in Natural Language Processing – the functionality of the so-called full-text search.

    Note: This article concentrates on the seventh version of Elasticsearch; at the time of writing, the more recent version 8 has already been released and comes with some additional features.

    Database vs. Elasticsearch

    But wait, isn’t a commonly used database designed to store and search for information quickly? Do we really need Elastic or any other search engine? Well, yes and no. Databases are great for fast and frequent inserts, updates, or deletes, unlike Data Warehouses or Elasticsearch.


    Yes, that’s right, Elasticsearch is not a good choice when it comes to endless inserts. It’s often recommended to treat Elastic as “once built, never modified again.” It is mainly due to the way inverted indices work – they are optimized for search, not modification.

    Besides, databases and Elastic differ in their use case for searching. Let’s use an example for better illustration; imagine you run a library and have plenty of books in your collection. Each book can have numerous properties associated with it, for example the title, text, author, ISBN (a unique book identifier), etc., which all have to be stored somewhere, most probably in some sort of database.


    When you query for a particular book by a given author, the search is likely to be fast – probably even faster if you create a database index on this field. The field is then saved on disk in a sorted manner, which speeds up the lookup process significantly.

    But what if you wanted to find all books containing a certain text fragment? In a database, we would probably reach for the SQL LIKE operator, possibly with some % wildcards.
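    To make this concrete, here is a minimal sketch of such a lookup using SQLite in Python; the books table and its contents are hypothetical:

    ```python
    import sqlite3

    # Hypothetical in-memory library database with a few books
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE books (title TEXT, author TEXT, text TEXT)")
    conn.executemany(
        "INSERT INTO books VALUES (?, ?, ?)",
        [
            ("Moby Dick", "Herman Melville", "Call me Ishmael. Some years ago..."),
            ("Walden", "Henry David Thoreau", "When I wrote the following pages..."),
        ],
    )

    # Substring search with LIKE and % wildcards: a full table scan with
    # no relevance ranking - every row either matches or it doesn't
    rows = conn.execute(
        "SELECT title FROM books WHERE text LIKE ?", ("%Ishmael%",)
    ).fetchall()
    print(rows)  # [('Moby Dick',)]
    ```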

    Soon, further questions come along:

    • What if you want to order the rows by how closely the text relates to what you queried for?
    • What if you have two fields, e.g. title and text, that you would like to include in your search?
    • What if you don’t want to search for the entire phrase but divide the query into separate words and accept hits containing only some of them? 
    • What if you want to reject the commonly occurring words in the language and consider only the relevant parts of your query?

    You can probably see how problematic more complex searches become when using SQL-like queries and standard databases. That’s the exact use case for a search engine.

    In short, if you want to search by ISBN, title or author, go ahead and use the database. However, if you intend to search for documents based on passages in a long text, at the same time focusing on the relevance of words, a search engine, Elasticsearch in particular, would be a better choice.

    Elasticsearch manages to match queries against documents’ texts through a multitude of query types that we’ll expand on further. However, its most important feature is the inverted index, created on terms coming from tokenized and preprocessed original texts.


    The inverted index can be thought of as a dictionary: we look up some word and get a matching description. In essence, it is a mapping from single words to the whole documents that contain them.

    Given the previous example of a book, we would create an inverted index by taking the keywords from a book’s content, or the ones that describe it best, and map them as a set/vector, which from now on would represent that book.

    So normally, when querying, we would have to go through each database row and check for some condition. Instead, we can break the query up into a tokenized representation (a vector of tokens) and only compare this vector to the vectors of tokens already stored. Thanks to that, we can also easily implement a scoring mechanism to measure how relevant each document is to the query.
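    To make the idea tangible, here is a minimal sketch of an inverted index in Python; the toy documents and the whitespace tokenization are drastic simplifications of what Elasticsearch actually does:

    ```python
    from collections import defaultdict

    # Toy corpus: document id -> text (hypothetical examples)
    docs = {
        1: "Elasticsearch is a search engine",
        2: "A search engine indexes documents",
    }

    # Build a minimal inverted index: term -> set of document ids.
    # Real analyzers do much more; lowercasing is our only "filter" here.
    inverted_index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            inverted_index[term].add(doc_id)

    # Querying becomes a lookup per term instead of a scan over all rows
    query_terms = "search engine".lower().split()
    hits = set.intersection(*(inverted_index[t] for t in query_terms))
    print(hits)  # {1, 2}
    ```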

    ​​As a side note, it is also worth adding that each Elasticsearch cluster comprises many indices, which in turn contain many shards, also called Apache Lucene indices. In practice, it uses several of these shards at once to subset the data for faster querying.

    Elasticsearch clusters search engine

    Step 2: Understand when not to use Elasticsearch

    Elasticsearch is a wonderful tool; however, as with many tools, when used incorrectly, it can cause as many problems as it actually solves. What we would like you to grasp from this article is that Elasticsearch is not a database but a search engine and should be treated as such. Meaning, don’t treat it as the only data storage you have. There are multiple reasons for that, but we think the most important ones are:

    1. Search engines should only care about the data that they actually use for searches. 
    2. When using search engines, you should avoid frequent updates and inserts.

    In re 1)

    Don’t pollute the search engine with stuff you don’t intend to use for searching. We know how databases grow and schemas change with time. New data gets added, causing more complex structures to form. Elasticsearch won’t be fine with that; therefore, a separate database, from which you can link some additional information to your search results, might be a good idea. Besides, additional data may also influence the search results, as you will find out in the section on BM25.

    In re 2)

    Inverted indices are costly to create and modify. New entries in Elasticsearch enforce changes in the inverted index. The creators of Elastic have thought of that as well: instead of rebuilding the whole index every time an update happens (e.g., 10 times a second), a separate small Lucene index is created (the lower-level mechanism Elastic builds on). It is then merged (a reindex operation) with the main one. The process takes place every second by default, but it also needs some time to complete. It takes even more time when dealing with more replicas and shards.


    Any extra data will cause the process to take even longer. For this reason, you should only keep important search data in your indices. Besides, don’t expect the data to be immediately available: Elastic is not ACID compliant; it is more like a NoSQL datastore that focuses mainly on BASE properties.
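    If you do need a burst of heavy indexing, one knob worth knowing is the refresh interval. Below is a hedged sketch with the official Python client; the address and index name are assumptions:

    ```python
    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # address is an assumption

    # Slow down the default 1-second refresh during bulk indexing,
    # trading result freshness for cheaper index maintenance
    es.indices.put_settings(
        index="books",  # hypothetical index name
        body={"index": {"refresh_interval": "30s"}},
    )

    # ... bulk index documents here ...

    # Restore the default refresh behaviour afterwards
    es.indices.put_settings(
        index="books",
        body={"index": {"refresh_interval": "1s"}},
    )
    ```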

    Step 3: Understand the scoring mechanism

    Okapi BM25

    The terms stored in the index influence the scoring mechanism. BM25 is the default scoring/relevance algorithm in Elasticsearch, a successor to TF-IDF. We will not dive too deeply into the math here, as it would take up an entire article. However, we will pick the most important parts and try to give you a basic understanding of how it works.

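    The formula itself was shown as an image in the original post; for reference, Okapi BM25 is commonly written as below, with k1 and b as free parameters and the remaining symbols matching the components discussed in this section:

    ```latex
    \mathrm{score}(D, Q) = \sum_{i=1}^{n} IDF(q_i) \cdot
        \frac{f(q_i, D) \cdot (k_1 + 1)}
             {f(q_i, D) + k_1 \cdot \left(1 - b + b \cdot \frac{fieldLen}{avgFieldLen}\right)}
    ```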

    The equation might be a little confusing at first, but it becomes pretty intuitive when looking at each component separately.

    • The first component is IDF(qi) – if you are comfortable with IDF (inverse document frequency), this one might be familiar to you. qi stands for each term from a query. What it essentially does is penalize the terms that appear often across all documents by counting how many times they appear in total. We would rather take into account only the most descriptive words in a query and discard the other ones.

    For example:

    Elasticsearch is a cool search engine designed for fast querying

    If we tokenized this sentence, we would expect words like Elasticsearch, search, engine, querying to be more valuable than is, a, cool, designed, for, fast, as the latter contribute less to the essence of the sentence.

    • Another relevant factor is the function f(qi, D) or frequency of the term qi within document D, for which the score is being counted. Intuitively, the higher the frequency of query terms within a particular document, the more relevant this document is.
    • Last but not least is the fieldLen/avgFieldLen ratio. It compares how long a given document is with the average length of all documents stored. Since it is placed in the denominator, the score will decrease as the document’s length grows, and vice versa. So if you are seeing more short results than long ones, it is simply because this factor boosts shorter texts.

    Step 4: Understand the mechanism of text pre-processing

    Analyzers

    Probably the first question you need to raise when thinking about optimization is: how are the texts preprocessed and represented within your inverted index? There are many ready-to-use concepts in Elasticsearch, taken from Natural Language Processing. They are encapsulated within so-called analyzers that turn continuous text into separate terms, which are indexed instead. In layman’s terms, an analyzer is both a Tokenizer, which divides the text into tokens (terms), and a collection of Filters, which do additional processing.

    We can use the built-in Analyzers provided by Elastic, or define our own. In order to create a custom one, we should determine which tokenizer we’d like to use and provide a set of filters, as in the sketch below.
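    As a hedged sketch (the index, analyzer, and field names are hypothetical), a custom analyzer is defined in the index settings and then referenced in the mappings:

    ```python
    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # address is an assumption

    # A custom analyzer: standard tokenizer plus lowercasing,
    # stopword removal, and stemming filters
    es.indices.create(
        index="books",
        body={
            "settings": {
                "analysis": {
                    "analyzer": {
                        "my_text_analyzer": {
                            "type": "custom",
                            "tokenizer": "standard",
                            "filter": ["lowercase", "stop", "stemmer"],
                        }
                    }
                }
            },
            "mappings": {
                "properties": {
                    "text": {"type": "text", "analyzer": "my_text_analyzer"}
                }
            },
        },
    )
    ```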

    We can apply three possible analyzer types to a given field, which vary in how and when they process text:

    • indexing analyzer – used during the document indexing phase,
    • search analyzer – used to map query terms during the search, so they can be compared to the terms indexed in a field. Note: if we don’t explicitly define the search analyzer, the indexing analyzer for this field will be used by default,
    • search quote analyzer – used for the strict search of full phrases.

    Usually, there is no point in applying a search analyzer different from the indexing analyzer. Additionally, if you would like to test analyzers yourself, it can easily be done via the built-in API or directly from the client library in the language of your choice.
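    For instance, with the official Python client the analyze API can be called roughly like this (a sketch; the sample text is arbitrary):

    ```python
    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # address is an assumption

    # Inspect the terms the standard analyzer would actually index
    response = es.indices.analyze(
        body={"analyzer": "standard", "text": "Tom Hanks is a good actor"}
    )
    print([token["token"] for token in response["tokens"]])
    # ['tom', 'hanks', 'is', 'a', 'good', 'actor']
    ```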

    The built-in analyzers should be able to cover the most often used operations applied during indexing. If needed, you can use analyzers that are explicitly made for a specific language, called Language analyzers.

    Filters

    Despite their name, Filters not only perform token selection but are also responsible for a multitude of common NLP preprocessing tasks, such as:

    • stemming,
    • stopwords filtering,
    • lower/upper casing,
    • n-grams creation on chars or words.

    However, they cannot perform lemmatization. Below, we’ve listed some of the most common filters; if you’re interested in the complete list of available ones, you can find it in the official Elasticsearch documentation.

    • shingle – creates n-grams of words,
    • n-gram – creates n-grams of characters,
    • stop-words – removes stopwords,
    • stemmer (Porter/Lovins) – performs stemming according to the Porter/Lovins algorithm,
    • remove_duplicate – removes duplicate tokens.

    Tokenizers

    They aim to divide the text into tokens according to a selected strategy, for example:

    • standard_tokenizer – removes punctuation and breaks text on word boundaries,
    • letter_tokenizer – breaks text on each non-letter character,
    • whitespace_tokenizer – breaks text on any whitespace,
    • pattern_tokenizer – breaks texts on specified delimiter e.g., semicolon or comma.

    In the diagram below, we present some exemplary analyzers and their results on the sentence “Tom Hanks is a good actor, as he loves playing.”

    (Figure: exemplary analyzers and their results in Elasticsearch.)

    Each tokenizer operates differently, so pick the one that works best for your data. However, a standard analyzer is usually a good fit for many scenarios.

    Step 5: Understand different types of queries

    Queries

    Elasticsearch enables a variety of different query types. The basic distinction we can make is whether we care about the relevance score or not. With that in mind, we have two contexts to choose from:

    • query context – calculates the score: whether the document matches a query and how good the match is,
    • filter context – does not calculate the score; it only determines whether a document matches the query or not.

    So, use a query context to tell how closely documents match the query and a filter context to filter out unmatched documents that will not be considered when calculating the score.

    Bool query

    Even though we’ve already stated that we will mainly focus on text queries, it’s essential to at least understand the basics of Bool queries, since match queries boil down to them. The most significant aspect is the operator we decide to use. When creating queries, we would often like to use logical expressions like AND, OR, NOT. They are available in the Elasticsearch DSL (domain-specific language) as must, should, and must_not, respectively. Using them, we can easily describe the required logical relationships, as in the sketch below.
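    A minimal sketch of such a bool query, written as the body you would send to Elasticsearch (index and field names are hypothetical); note the filter clause, which runs in the filter context described above:

    ```python
    # Query DSL expressed as a Python dict, e.g. for es.search(index=..., body=query)
    query = {
        "query": {
            "bool": {
                "must": [{"match": {"text": "search engine"}}],     # AND
                "should": [{"match": {"title": "elasticsearch"}}],  # OR (boosts score)
                "must_not": [{"match": {"text": "database"}}],      # NOT
                # filter behaves like must but in the filter context,
                # so it does not contribute to the relevance score
                "filter": [{"term": {"language": "en"}}],
            }
        }
    }
    ```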

    Full text queries

    These are the ones we are most interested in, since they are ideal for fields containing text to which an analyzer has been applied. It is worth noting that during a search, the query text will be processed with the same analyzer that was used for indexing the queried field.

    There are several types of FTS queries:

    • intervals – uses rules for matching terms and allows for their ordering. What the query excels in is proximity searches. We are able to define an interval (hence the query name) in which we can look for some terms. Useful especially when we know that the searched terms will not necessarily occur together but might appear within a predefined distance from each other – or, on the contrary, when we want them to stay close together.
    • combined_fields – This type allows for querying multiple fields and treating them as if they were combined. For example: when querying for the first name and the last name which we might want to be paired.
    • query_string – a lower-level query syntax. It allows for creating complex queries using operators like AND, OR, NOT, as well as querying multiple fields or additions like wildcard operators.
    • simple_query_string – a higher-level wrapper for query_string, which is more end-user friendly.
    • match – “the go-to” choice for FTS, the subtypes are:
      • match_phrase – designed for “exact phrases” and word proximity matching,
      • multi_match – a match type that allows for querying multiple fields in a preferred manner.

    We will now focus on explaining Match based queries in more detail, as we find them versatile enough to do everything we need while being pretty quick to write and modify.

    Match query – this is the standard for full-text searches, where each query is analyzed the same way as the field it is matched against. We find the following parameters to be the most important ones (combined in the sketch after this list):

    • Fuzziness – when searching for phrases, users can make typos. Fuzziness makes it possible to deal with such spelling errors quickly by searching for similar words at the same time. It defines the accepted error rate for each word, interpreted as the Levenshtein edit distance. Fuzziness is an optional parameter and can take values such as 0, 1, 2, or AUTO. We recommend keeping the parameter as AUTO, since it automatically adjusts how many errors can be made per word depending on its length: the allowed edit distance is 0, 1, and 2 for terms of up to 2, 3–5, and more than 5 characters, respectively. If you decide to use synonyms in a field, fuzziness can no longer be used.
    • Operator – as mentioned above, a boolean query is constructed based on the analyzed search text. This parameter defines which operator AND or OR will be used, and defaults to OR. For example, the text “Super Bitcoin mining” for OR operator is constructed as “Super OR Bitcoin OR mining,” while for AND, it is built as “Super AND Bitcoin AND mining.” 
    • Minimum_should_match – this defines how many of the terms in a boolean query should be matched for the document to be accepted. It is quite versatile as it accepts integers, percentages, or even their combinations. 
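    A hedged sketch combining the three parameters above in a single match query (the field name and query text are made up):

    ```python
    query = {
        "query": {
            "match": {
                "text": {
                    "query": "Super Bitcoin minning",   # note the typo
                    "fuzziness": "AUTO",                # tolerates small typos
                    "operator": "or",                   # the default
                    "minimum_should_match": "75%",      # at least 75% of terms must match
                }
            }
        }
    }
    ```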

    Match phrase query – it’s a variation of the match query where all terms must appear in the queried field, in the same order, next to each other. The sequence can be modified a bit when using an analyzer that removes stopwords.

    Match prefix query – it converts the last term in the query into a prefix term, which acts as a term followed by a wildcard. There are two types of this query:

    • Match boolean prefix – from the terms a boolean query is constructed,
    • Match phrase prefix – the terms are treated as a phrase; they need to be in specific order.

    When using a match phrase prefix query, “Bitcoin mining c” would match both documents “Bitcoin mining center” and “Bitcoin mining cluster”, since the first two words form a phrase, while the last one is considered a prefix.

    Combined fields query – allows for searching through multiple fields as if they were combined into a single one. Clarity is a huge advantage of the combined fields query, since this type of query is converted into a boolean query with the chosen logical operators. However, there is one important assumption for the combined fields query: all queried fields require the same analyzer.

    The disadvantage of this query is the increased search time, as it must combine fields on the fly. That’s why it might be wiser to use copy_to when indexing documents.


    Copy_to allows for creating separate fields that combine data from other fields, which translates into no additional overhead during searches. A sketch of such a mapping follows.
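    In this hedged sketch (field names are hypothetical), title and author are copied into a combined field at indexing time, so searching the combined field costs nothing extra at query time:

    ```python
    mappings = {
        "mappings": {
            "properties": {
                "title": {"type": "text", "copy_to": "title_and_author"},
                "author": {"type": "text", "copy_to": "title_and_author"},
                # The combined field is populated while indexing documents
                "title_and_author": {"type": "text"},
            }
        }
    }

    # es.indices.create(index="books", body=mappings)
    ```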

    Multi match query – it differs from combined fields, since it enables querying multiple fields that have different analyzers applied or even of a different type. The most important parameter is the type of a query:

    • best_fields – a default value, it calculates the score in each of the specified fields. Useful when we want the answer to appear in only one of the given fields instead of the terms to be found in multiple fields.
    • most_fields – the best when the same text can be found in different fields. Different analyzers might be used on those fields: one of them can have stemming and synonyms, the second can use n-grams, and the last one, the original text. The relevance score combines all fields’ scores and is then divided by the number of matches in each field.

    Note: best_fields and most_fields are treated as FIELD-centric, meaning that query matches are applied per field instead of per term. For example, the query “Search Engine” with the AND operator means that all terms must be present in a single field, which might not be our intention.

    • cross_fields – is considered TERM-centric and is a good choice when we expect the answer to be found in multiple fields, such as when querying for the first and the last name, which we would expect to find in different fields. Compared to most_fields and best_fields, where the terms MUST be found in the same field, here all terms MUST be found in at least one field. One more cool thing about cross_fields is that it can group together the fields with the same analyzer and calculate scores on the groups instead. More details can be found in the official documentation.

    Boosting

    We would also like to highlight that queries can be boosted. We use this feature extensively on a daily basis.
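    The boosted query from the original post was shown as an image; a sketch along the same lines uses the caret syntax of multi_match to boost individual fields (the index and query text are hypothetical):

    ```python
    query = {
        "query": {
            "multi_match": {
                "query": "bitcoin mining",
                # Title matches count double, Author matches four times,
                # Description stays at the default weight of 1.0
                "fields": ["title^2", "author^4", "description"],
            }
        }
    }
    ```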

    A query like the one above would multiply the score for the Title field by 2 and the Author field by 4, while the Description score remains unboosted. Boost can be an integer or a floating point number; values greater than 1.0 increase a field’s relevance score, while values between 0 and 1.0 decrease it.

    Conclusion

    To sum up, we’ve presented five steps we find crucial to start working with Elastic. We’ve discussed what Elasticsearch is, and what it isn’t, and how you’re supposed to and not supposed to use it. We’ve also described the scoring mechanism and various types of queries and analyzers.

    We are confident that the knowledge collected in this article is essential to start optimizing your search results. The article was intended as an introduction to some key concepts, but also as a foundation for the next one, in which we will provide examples of what is worth experimenting with and share the code.

    We hope that this blog post gave you some insight into how different search mechanisms work, and that you’ve learned something new or handy that you might one day find useful in your projects.

Artificial intelligence does the trick in digital transformation

    Digital transformation, our old chestnut, huh?* You might be thinking about leaving the page right now, but hold your horses, as today we’ll present it to you from a brand new perspective. The perspective is, yes, you’ve guessed it right, artificial intelligence – the apple of our eye and something that we have full confidence in.

    * If it is not yet a broken record for you, it’s time to catch up! Below you’ll find our other articles on digital transformation that will lay solid foundations for today’s topic.

    Why is AI in digital transformation important?

    Just to clarify things, digital solutions don’t equal artificial intelligence by default. They can, but they don’t have to. Saying it flat out, there is no need to look for AI solutions just for the sake of it. Sometimes simply switching widely used tools to e-tools will do the trick. However, in a lot of cases, artificial intelligence is the way to push the envelope and expand your business.

    Examples of artificial intelligence solutions in digital transformation

    Before we get the bit between our teeth let’s spell one thing out: 

    How to tell AI-powered solutions from the rest?  

    The easiest way to find out is to determine whether they aim to mimic intelligent human behaviour and solve problems unsolvable for traditional algorithms the way people would. Also, through data processing and analysis, AI algorithms should be able to learn over time and get better at what they do.

    Now it’s time to put all the above into practice and show you AI-based digital transformation in action. To organise it neatly, we’ve divided the topic into five areas.

    Computer vision 

    We use artificial intelligence to detect, recognize, and identify the contents of photos and videos. Depending on the business needs and areas to be digitalized, AI focuses on:

    • people and faces, as in the case of entrance authentication, identifying workplace bottlenecks, or determining whether employees wear protective equipment
    • places, e.g. localising your workers, creating self-driving industrial vehicles, locating parcels in logistics, improving workstation ergonomics (you can delve into the topic in visual place recognition and VPR Part 2)
    • objects – machinery automation (machines gaining sight), healthcare (disease diagnosis based on X-rays), pharma process automation (see our project on bacterial colony identification and counting), advanced quality control, e.g. elimination of impurities in the production processes, soil and crop monitoring for more adequate watering or fertilisation
    • text – invoice and contract automation, including optical character recognition (OCR); digitalization of all documentation and other sources (paperless factory being a thing nowadays)

    We’d like to point out that computer vision is widely used in manufacturing quality control, in algorithms that don’t use AI at all. Computer vision with AI is needed in cases where conventional CV can’t figure it out, such as telling air bubbles from bacteria colonies grown on Petri dishes.

    Natural language processing

    With natural language processing (NLP) algorithms, digital systems can identify, understand, and analyse human language. We would like to flag up the fact that it is still one of the most challenging areas of AI and the systems don’t work perfectly. However, the new Generative Pre-trained Transformer 3 (GPT-3) seems to do the trick. 

    With NLP, we can speed up a lot of tasks, such as: 

    • customer service – AI-powered chatbots answering the most common inquiries, while detection of the most sensitive cases that need an immediate reaction is possible thanks to sentiment analysis
    • customer profiling – offering tailored solutions automatically (increasing the chances for your offer to be accepted)
    • semantic search – helping employees to look for information in company files
    • classification of documents and client/patient/contractor data

    Data science

    Every day, your business gathers a mass of data: on your customers and their journey, operations, employee effectiveness, etc. Data science aims at uncovering intricate patterns that can help businesses to improve their processes, and eventually grow. The areas worth mentioning are: 

    • forecasting – route planning in logistics, management of orders, forecasting the interest in particular products at a given time e.g. at Christmas, during the holiday season   
    • risk reduction – risk analysis, predictive maintenance in manufacturing
    • operation efficiency improvement – bottleneck identification, resource management, waste reduction 
    • recommender systems in e-commerce and well-targeted, more effective marketing 

    Similarly to the case of computer vision, we need to emphasise that not all data science mechanisms use artificial intelligence by definition. DS involves a lot of conventional statistics before it needs to reach for AI-based algorithms.

    Predictive modelling

    You can use predictive modelling to forecast events, customer behaviour, or market changes. Instead of analysing historical and current internal/external data manually, algorithms can do that effectively, speedily, and, most importantly, in real-time. A couple of usage examples:

    • sales volume prediction – for more effective production or store/hotel/restaurant service demand planning
    • risk calculation – commonly used in banking (among others in fraud detection), the insurance industry, manufacturing (predictive maintenance), or health care for analysing patients’ medical records  

    Sound recognition

    Sound identification algorithms might seem less spectacular, and their use limited, compared to the above examples. Still, you can use them successfully in process digitalization:

    • surveillance and monitoring – systems immediately detect the sound of glass breaking or any other unusual sounds, also identifying faulty machinery 
    • voice-controlled devices and machines in manufacturing, pharma, and healthcare, which do not require taking the gloves off
    • automatic transcription and voice dictation converting your calls and meetings into text
    • assisting employees and customers with disabilities such as vision impairment

    As proven by the numerous examples above, artificial intelligence plays a significant role in digital transformation. It takes operations, customer support, and daily work to a whole new level and makes businesses immune, or at least prepared, to unexpected events. Want to try AI for yourself? We’ll be happy to help (so make sure to contact us – we’ll take you for a test drive!).

Intro to coreference resolution in NLP


      Introduction

      Natural language processing (NLP) refers to the communication between humans and machines. NLP is one of the most challenging branches of Artificial Intelligence, mainly because human language is full of exceptions and ambiguities that are hard for computers to learn. One way of making it easier for them is to get rid of imprecise expressions that need context to be clearly understood. A good example is pronouns (e.g. it, he, her), which can be replaced with the specific nouns they refer to.

      But how about a real-world application?

      While working on a Question Answering system for the LMS platform, we’ve encountered several problems, especially with sentence embeddings – vector representations of text. It happens that a sentence sometimes consists of many pronouns, and such embeddings often don’t reflect the original sentence correctly when sufficient context isn’t provided. In order to obtain richer embeddings, we’ve added coreference resolution to our pipeline.

      What is coreference resolution?

      Coreference resolution (CR) is the task of finding all linguistic expressions (called mentions) in a given text that refer to the same real-world entity. After finding and grouping these mentions we can resolve them by replacing, as stated above, pronouns with noun phrases.

      Coreference resolution is an exceptionally versatile tool and can be applied to a variety of NLP tasks such as text understanding, information extraction, machine translation, sentiment analysis, or document summarization. It is a great way to obtain unambiguous sentences which can be much more easily understood by computers.

      Coreference vs. anaphora resolution

      It should be noted that we refer to coreference resolution as the general problem of finding and resolving references in a text. However, technically there are several kinds of references, and their definitions are a matter of dispute.

      The case most often distinguished from coreference resolution (CR) is anaphora resolution (AR). The relation of anaphora occurs in a text when one term refers to another, determining the interpretation of the second one [3]. In the example below, we see that (1) and (2) directly refer to different real-world entities; however, they are used in the same context, and our interpretation of (2) relies on (1). These mentions do not co-refer but are in the relation of anaphora.

      Even though anaphora resolution is distinct from coreference resolution, in the vast majority of cases one equals the other. There are many more examples of such differences and various other kinds of references; however, CR has the broadest scope and covers the vast majority of cases. As we would like to simplify this topic, from now on we are going to assume that all types of relations between terms are coreferential.

      Different types of references

      Even if we assume that we can treat all kinds of references as coreference, there are still many different forms of relations between terms that are worth noting. That’s because every kind can be treated differently, and most classic natural language processing algorithms are designed to target only specific types of references [1].

      Anaphora and cataphora

      These are the bread and butter of our topic. The main difference is that an anaphora occurs in the sentence after the word it refers to, while a cataphora is found before it. The word occurring before an anaphora is called an antecedent, and the one following a cataphora is a postcedent.

      Split antecedents

      It’s an anaphoric expression where the pronoun (2) refers to more than one antecedent (1).

      Coreferring noun phrases

      It’s also an anaphoric example of a situation in which the second noun phrase (2) is a reference to an earlier descriptive form of an expression (1).

      Presuppositions / bound variable

      Some argue whether presupposition can be classified as a coreference (or any other “reference”) resolution type. That’s because the pronoun (2) is not exactly referential – in the sense that we can’t replace it with the quantified expression (1). However, after all, the pronoun is a variable that is bound by its antecedent [3].

      Misleading pronominal references

      There are also certain situations that can be misleading: when there is no relationship between a pronoun and other words in the text, and yet the pronoun is there. While creating a CR algorithm, we need to pay special attention to these kinds of references, so it’s good to know in which situations we come into contact with them.

      Clefts

      A cleft sentence is a complex expression that has a simpler, less deceptive substitution. It’s a case where the pronoun “it” is redundant, and we can easily come up with a sentence that has the same meaning but doesn’t use the pronoun.

      Pleonastic “it”

      This type of reference is very common in English, so it requires emphasis. The pronoun “it” doesn’t refer to any other term, but it is needed in the sentence in order to make up a grammatical expression.

      Steps for coreference resolution by example

      It’s always best to visualize an idea and provide a concrete example as opposed to just theorizing about a topic. What’s more, we’ll try to explain and give concrete examples of the most common terms associated with coreference resolution that we may come across in articles and papers.

      The first step in order to apply coreference resolution is to decide whether we would like to work with single words/tokens or spans. 

      But what exactly is a span? It’s most often the case that what we want to swap, or what we are swapping for, is not a single word but multiple adjacent tokens. Therefore, a span is a whole expression. Another name for it you may come across is a mention; the two are often used interchangeably.

      In most state-of-the-art solutions, only spans are taken into consideration. This is because spans carry more information within them, while single tokens may not convey any specific details on their own.

      Step 1 – identify potential spans

      As we can see in this great quote from J.R.R. Tolkien, there are several potential spans that could be grouped together. Here we have spans like “Sam” or “his” that have only a single token in them, but we also see the span “a white star”, consisting of three consecutive words.

      The next step is to somehow combine the spans into groups.

      Combining items is referred to as clustering or grouping. It is, as its name suggests, a method of taking arbitrary objects and grouping them together into clusters/groups within which the items share a common theme. These can range from words in NLP, through movie categories on Netflix, to grouping foods based on their nutritional values.

      There are many ways one may group, but what’s important is that things in the same group should possess similar properties and be as different as possible from other clusters.

      Step 2 – group spans

      Here the “property” we are looking for is the spans referring to the same real-world entity.

      The resulting groups are [Sam, his, he, him] as well as [a white star, it]. Notice that “Sam” and “a white star” are marked as entities. This is a crucial step in coreference resolution: we need to not only identify similar spans but also determine which one of them is what is often referred to as the real-world entity.

      There is no single definition of a real-world entity, but we will simply define it as an arbitrary object that doesn’t need any extra context to clarify what it is – in our example, “Sam” or “a white star”. On the other hand, “his” or “him” are not real-world entities, since they must be accompanied by additional background information.

      Step 3 – replace pronouns with real-world entities

      As we can see, [his, he, him] and [it] have been replaced with the real-world entities from the corresponding groups – “Sam” and “a white star”, respectively. As a result, we obtained a text without any pronouns that is still valid grammatically and semantically.
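      As an illustration, here is a hedged sketch of these three steps using the open-source neuralcoref library on top of spaCy (the input sentence is our own; exact cluster output may vary by model version):

      ```python
      import spacy
      import neuralcoref

      # Load an English spaCy model and add neuralcoref to the pipeline
      nlp = spacy.load("en_core_web_sm")
      neuralcoref.add_to_pipe(nlp)

      doc = nlp("Sam saw a white star. He looked at it and smiled.")

      # Steps 1 + 2: spans are identified and grouped into clusters
      print(doc._.coref_clusters)
      # e.g. [Sam: [Sam, He], a white star: [a white star, it]]

      # Step 3: mentions are replaced with the main mention of each cluster
      print(doc._.coref_resolved)
      # e.g. "Sam saw a white star. Sam looked at a white star and smiled."
      ```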

      Summary

      The aim of coreference resolution is to find, group, and then substitute any ambiguous expressions with the real-world entities they are referring to.

      We’ve discussed a difference between coreference and anaphora resolution as well as shown and explained a couple of common problems associated with them. We’ve also managed to walk through the typical process of CR using an example.

      By doing so, sentences become self-contained, and no additional context is needed for the computer to understand their meaning. We won’t always have well-defined entities, but more often than not, coreference resolution will lead to information gain.

      This is only the first article in the series concerning coreference resolution and natural language processing. In the next one, we will show the pros and cons of the biggest deep learning solutions that we’ve tested ourselves and finally decided to implement in our system.

      References

      [1]: Rhea Sukthanker, Soujanya Poria, Erik Cambria, Ramkumar Thirunavukarasu (July 2020) Anaphora and coreference resolution: A review https://arxiv.org/abs/1805.11824

      [2]: Sharid Loaiciga, Liane Guillou, Christian Hardmeier (September 2017) What is it? Disambiguating the different readings of the pronoun ‘it’ https://www.aclweb.org/anthology/D17-1137/

      [3]: Stanford lecture (CS224n) by Christopher Manning (2019) https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1162/handouts/cs224n-lecture10-coreference.pdf

      [4]: Coreference Wikipedia https://en.wikipedia.org/wiki/Coreference

      Project co-financed from European Union funds under the European Regional Development Funds as part of the Smart Growth Operational Programme.
      Project implemented as part of the National Centre for Research and Development: Fast Track.

Artificial Intelligence use in production process optimization

      Artificial Intelligence has become a buzzword indeed, not only in business settings but also in public debate. It is used at every turn, often in the context of Industry 4.0… but how it can work in real-life scenarios, almost no one knows.

      Artificial Intelligence plays a leading role in streamlining industrial processes and increasing their efficiency. Since at NeuroSYS we develop AI-based systems for a variety of purposes and industries – just have a look at our two platforms, nsFlow and Samelane – I believe we can chip in on the topic, explaining the basic areas of AI algorithm usage and showing you our implementations.

      The article you’re reading builds on the lecture I gave during Global Entrepreneurship Week in November 2021.

      Artificial Intelligence in everyday life

      Every day we come across AI, often without even knowing it. A typical example is a speech-to-text application or Google Translate. The latter, since the introduction of its neural network model, has excelled in translation quality. The Google search engine is yet another example. It used to scan websites and provide us with a list of those where the searched term appeared. For over a year, though, it has been working differently – it tries to answer questions directly, behaving in a human manner. We can still see search results, as we used to, but often the top result is a straightforward reply. This breakthrough has been possible thanks to NLP analysis – interpretation of the searched phrase and analysis of its context.

      The essence of AI 

      The prime objective of Artificial Intelligence algorithms is to solve specific problems that until now only humans could, and to improve processes. In manufacturing, this applies to these central areas:

      • process automation 
      • work safety 
      • quality control
      • maintenance 

      Most importantly, the algorithms can learn over time and address issues from a variety of business spheres. They don’t have to be sophisticated and visually pleasing – they simply have to work.

      The difference between Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL)

      As you can see in the chart below, there is no point in comparing these three terms, since Machine Learning and Deep Learning are subareas of Artificial Intelligence. Let’s have a quick look at all three of them:

      • Artificial Intelligence (AI) – algorithms simulate – even if partially – intelligent/human behaviour
      • Machine Learning (ML) – statistical methods and algorithms that allow computers to learn automatically from collected data; machines ‘create’ algorithms themselves by uncovering patterns in real-life data – we don’t have to provide them with information on how to solve problems; they do it by gathering and classifying data.
      • Deep Learning (DL) – a subgroup of Machine Learning; these ML algorithms are based on deep neural networks. We don’t even define the rules – they’re created by self-learning, based on the data gathered.
      (Chart: the difference between Artificial Intelligence, Machine Learning, and Deep Learning.)

      General uses of Machine Learning

      Computer Vision (CV)

      Computer Vision has been present in the manufacturing industry for years, based on classic vision algorithms and systems – cameras and laser scanners. For example, to verify whether a component is correct, we’d take a picture and then compare its measurements, contrast, pixel colour, colour saturation, and size to the prototype. It’s still a popular method in quality control when a lot of identical parts are produced and the margin of error is really narrow.

      However, the same algorithms don’t cope well with production on a small scale and where the margin of error is wider. Let’s take quality control in fruit sorting lines as an example. There is no single prototype of an ideal apple – apples differ in shape, size, and colour – but they are still suitable for sale. In such cases, neural networks can be much more efficient at classifying apples by learning what a good apple means, rather than just comparing images of apples to a pattern.

      Natural Language Processing (NLP) 

      NLP aims for effective human-computer communication. To achieve that, machines have to be able to understand what we say – and, ultimately, what we mean. NLP algorithms are used in speech-to-text programs, written text analysis, or search engines. In manufacturing, they can be used to organize information and documentation and make it easier to search for particular details or figures. Maintenance engineers spend on average 33% of their time looking for information they need for work, so NLP can save them a lot of time.

      Predictive analysis 

      Machine Learning can be used to build predictive models based on historical data to make better managerial decisions. With predictive analysis, you can plan purchases, stocking, production, and keep inventory as low as possible without a downtime risk. 

      Signal processing

      You can use neural networks to create noise reduction solutions, for example for the remote communication of technicians in a production hall. The main problem in such conditions is that there are various noise profiles, and thus background sound can be reduced in real time only thanks to noise reduction algorithms.

      Recommender systems 

      We encounter recommender systems most often in our daily lives – doing online shopping, listening to music, or watching movies. Machine learning algorithms, based on our previous behaviour and user profiles, try to anticipate our choices and suggest products, songs, and movie titles we will most probably like. The mechanism may be applied to the manufacturing industry as well, as part of an Industry 4.0 introduction – for example, recommending training workers might need or documentation they should get accustomed to.


      Other AI applications in the industry 

      Just a few more examples of how you can employ AI to optimize your industrial processes: 

      • automate process control using reinforcement learning (e.g. control of heating boilers in electrical power and heating plants)
      • plan production and manage inventory using predictive models
      • new possibilities of robotization, e.g. pick & place solutions in packaging centres
      • better access to information with the NLP use

      Examples of AI in production process optimization 

      You’ll find most of our use cases in the case study section, but let me focus on a few exemplary ones.

      1. Microbiological test automation in the pharmaceutical industry 

      Problem

      In drug production, there are tens of thousands of microbiological samples monthly for each cleanroom. Cleanrooms have to be constantly controlled so as not to contaminate the medicines. Each sample can contain one out of a thousand strains of bacteria – or none (be sterile). Laboratory technicians are needed to determine the result and count colonies. Performed manually, the process is time-consuming and costly. Since there are thousands of bacteria types known to humans – differing in colour, shape, and colony size – and they can grow next to each other on Petri dishes, classical algorithms couldn’t be applied here.

      Solution

      We’ve introduced full automation of the incubation process, using a robotic machine with a microbiological incubator. We used a vision system based on R-CNN neural networks for bacterial detection and automatic colony counting. Unlike a technician, it works 24/7, remaining effective and precise. The solution considerably reduced the time (from 12–24 to 6–8 hours) and cost of sample analysis, not to mention lowering the risk of human error.

      You’ll find more on the project in a dedicated case study.

      2. Employee training standardization and automation in the automotive industry

      Problem

      Practical training in the automotive industry requires 1:1 meetings between instructors and learners. Thus, the cost is high, and not many workers can undergo training at the same time. Moreover, learners’ results depend solely on instructors – they aren’t assessed objectively. There’s also no way to analyse the most common mistakes people make in order to introduce changes.

      Solution

      To be able to train more people at the same time, we’ve replaced 1:1 forklift training with an AI-supervised system with AR glasses. The task sequences are displayed in front of learners’ eyes, and it is the system that verifies whether they were carried out correctly or should be repeated. It is able to do this thanks to cameras mounted both on the forklift and in the production hall, analyzing the position of the vehicle and the trainee in real time.

      3. Personal protective measures control in the manufacturing industry 

      Problem

      Working in a production hall often requires special protective measures, such as safety glasses, hard hats, vests, or masks. Sometimes more than one is required. The problem is, employees often forget to wear them, but punishing them for the fact is demotivating and thus counterproductive. 

      Solution

      Instead of punishing employees, we’ve built an AI system to automatically verify whether they wear face masks. A camera mounted at the entrance to the production hall detects people without masks, and a kind reminder to put them on is displayed. It also says thank you to those already wearing one. The objective is to encourage workers to wear masks and develop the habit. The system can collect statistical data, such as the number of employees not wearing masks and which time of day is the most sensitive one. It can work in a similar way with hard hats or protective glasses.

      If I managed to pique your interest in the topic and you’d like to know more, I recommend our blog section devoted to AI and Machine Learning. As far as ready-to-go industrial solutions are concerned, go directly to our nsFlow platform.

Big data – 4 real life examples and use cases

      It doesn’t matter whether you pronounce it “day-ta” or “dah-ta” – the subject is of great importance.

      Collecting, analyzing, and processing data is the future of understanding repetitive patterns of human behavior. Big data is a phrase that is becoming increasingly popular around the world. Seen as a powerful solution providing valuable insights, it also causes resentment due to fear of excessive surveillance by the entities using it. Used mainly by analysts, it affects the lives of ordinary people as well.


        Big data who?

        Big data – definition and features

        What is big data?

        Big data is defined by data sets coming in increasing volumes, containing greater variety, and arriving with greater velocity. These three V’s are considered its key characteristics.

        In simpler words, big data means larger, more complex data sets, especially those coming from new sources – all the data organizations possess. These sets are so extensive that traditional data processing software isn’t sufficient to manage them. “Old school” systems are designed to process highly structured data, while modern systems incorporate artificial intelligence and machine learning to browse through information not stored in pre-set models. The good news is that these massive amounts of data can be used to solve business problems that a company would not have been able to handle before. Data examination plays a role in drawing conclusions about already recognized internal operational areas. But wait, there’s more!

        What can big data do for you?

        Big data carries great potential for discovering new dependencies and solutions, and for learning lessons about clients, their needs, and companies themselves. Data discovery covers the process of analyzing data to explore insights from carried-out operations and communicating conclusions to interested parties, with the aim of improving business processes. Data discovery is about more than just finding new answers – it is also about asking further questions and creating hypotheses. This approach allows entities to focus on more accurate, data-based decision-making rather than remaining stuck in information jams. With bigger, cheaper, and more accessible data sets, companies seize the opportunity to make more relevant and accurate decisions.

        The analysis of seemingly disconnected pieces, scraps, and collections enables seeing the bigger picture. The devil is in the details, as some say. Big data sets are a company’s capital, hiding intrinsic value. Analysis carried out on collected data enables companies to provide better services and develop new products. With the ongoing reduction of data storage and processing costs, more opportunities emerge. The examples below demonstrate the growing application of big data in everyday life.

        4 examples of using big data in real life

        Healthcare

        Big data analytics in this sector can contribute, among other things, to improving patient service, determining and implementing more effective ways of treating patients, supporting clinical research, monitoring health care safety, creating systems ensuring management control, and counteracting epidemics and other threats. The last decade brought tremendous advancement in the amount of data, along with the ability to use technology to analyze and understand it. The world’s population is growing all the time and life expectancy is increasing, driving rapid changes in treatments. Many of these decisions are dictated by data. The goal of today’s medicine is to get as much information about the patient as possible and to detect disease as early as possible. Why? Prevention is better than cure, obviously.

        The popularity of smartphones and wearable devices is part of the revolution, as everyday technology can change our life and health for the better. Easily accessible applications monitor step count, diet progress, the quality of every night’s sleep, as well as blood pressure and breathing rate. Medical professionals can analyze gathered data and decide on future steps of treatment. Examples of big data in healthcare include opioid prevention based on the analysis of 742(!) risk factors, improving telemedicine, adjusting staffing according to patients’ inflow, and improving disease research & prediction.

        Sales & marketing

        The analysis of consumer behavior presents a whole new level of working with data. Thanks to information gathered from smart devices armed with GPS technology, social media, and other traces left by customers online (e.g. their purchase history or posted opinions), companies are now able to analyze the reactions of not only selected groups of customers but even specific individuals. This approach is called individualization, and creating offers based on conclusions drawn from a given consumer’s previous contacts with the brand – personalization. The most common use is product recommendation, providing a tailored experience in e-commerce based on, e.g., browsing history and customer behavior. Big data in marketing opens new possibilities to optimize the conversion rate, recognize the needs of user groups, improve a website’s usability, and simplify purchasing processes. Drawing conclusions from mass data helps to improve customer engagement, retention, and loyalty.

        Automotive & transportation

The automotive industry has already adopted big data in several areas. From manufacturing better elements and parts, to improving driver safety, to enhancing car sales, data analysis is a useful tool. Car manufacturers like BMW benefit from the analysis of extensive data sets, including predictive maintenance, creating tailored customer solutions, and building the cars of tomorrow using cutting-edge technology. On the verge of the autonomous car era, automotive companies gather and process data from various devices, including GPS, onboard sensors, and cameras. For now, the acquired data is mostly used in route planning, while the near future will bring a growing need to incorporate the data in training vehicles to operate on the streets.

Speaking of route planning, one of the most impressive use cases where data analysis enabled huge savings and optimizations is the implementation at UPS. The use of AI and big data significantly improved the company’s global logistics network. The analysis of daily routines during parcel delivery made it possible to determine the most efficient paths, avoiding the left turns that disrupt smooth rides. The result? An expected reduction of 100 million delivery miles.

So that’s cars and logistics – what about human transportation? One of our projects consisted of customer behavior analysis for shared mobility services, based on – unsurprisingly – data, and large amounts of it. Insights into the demand for vehicles in place and time give the company the ability to distribute its assets optimally, catering to customers’ needs in the best possible way. With data-driven solutions, our client can streamline the services offered, improve user experience and, eventually, reduce customer churn and boost revenue.

        Cybersecurity

Big data, thanks to its ability to quickly analyze large, diverse data sets, makes it possible to predict and prevent cyberattacks. Processing massive amounts of data allows possible dangers to be anticipated and countermeasures implemented before they become reality. Data analysis enables quick threat detection and an effective reduction of false alarms by recognizing deviations from normal activity. In this area, big data solutions are used by both governmental institutions and private sector entities. Federal agencies report significant decreases in security breaches due to the implementation of big data-based systems. Additionally, the number of malware and DDoS attacks, as well as insider threats, has decreased since the technology gained popularity.
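
As an illustration of the “deviations from normal activity” idea, here is a minimal sketch using an Isolation Forest from scikit-learn. The features and numbers are invented for the example; a real security pipeline would be far richer.

```python
# An illustrative sketch (invented numbers, not a deployed system): flagging
# deviations from normal network activity with an Isolation Forest.
import numpy as np
from sklearn.ensemble import IsolationForest

# Toy per-connection features: bytes sent, duration (s), failed logins.
normal = np.random.normal(loc=[500, 2.0, 0.0],
                          scale=[100, 0.5, 0.2], size=(1000, 3))
suspicious = np.array([[50_000, 30.0, 12.0]])  # clearly out of distribution

detector = IsolationForest(contamination=0.01, random_state=0).fit(normal)
print(detector.predict(suspicious))  # -1 marks an anomaly, 1 normal traffic
```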

        Where are we heading?

Some examples of big data use are already considered risky, to say the least, or dead ends. Luckily, we do not live in the reality of the 2002 sci-fi movie “Minority Report”. Ideas known from the movie remain a fantasy, and some early attempts to identify individuals “most likely to commit a crime” through computer-generated lists were regarded as profiling that went too far. On the brighter side, big data enables improving various aspects of life, bringing value to products and services.

In the world as we know it, the amount of data generated will continue to grow. In the digital era, data is probably our most valuable resource and product at the same time.

With over 4.5 billion people using the internet, of whom over 90% hold mobile devices, mass production of data takes place constantly. The phenomenon of big data is also associated with challenges arising from complexity, security and privacy risks, and the need for new technologies and human skills. Companies and governmental institutions need to create a data science culture and incorporate it into their structures. As stated in the Capgemini report Big & Fast Data: The Rise of Insight-Driven Business, only 27% of executives currently consider their big data initiatives successful. In the data-driven world, only by preparing for widely adopted analysis and assessment can entities achieve insight-led operations and remain relevant.

Do you want to make use of the data your company produces and collects? Try our free consultations and see what data can do for you.

Recognition and counting of microorganisms on Petri dishes https://dev.neurosys.com/blog/recognition-and-counting-of-microorganisms Wed, 13 Oct 2021 19:18:27 +0000 https://dev.neurosys.com/?post_type=article&p=10922 The manufacturing processes in the pharmaceutical, cosmetic, and food industries are governed by strict policies and regulations that obligate manufacturers to perform constant microbiological monitoring. This means thousands of samples, usually in the form of standard Petri dishes (with microbial cultures grown on agar medium), that have to be analysed and counted manually by experienced microbiologists. This is a time-consuming and error-prone process that requires a trained professional. To avoid these issues, an automated method for this task would be highly valuable.

In this article we present deep learning methods for analysing microbiological images, developed by the NeuroSYS Research team. The crucial thing in training machine learning models is obtaining a large, well-constructed dataset. Thus, we will utilize the AGAR dataset introduced in our previous post to train a model that counts and classifies bacterial colonies grown on Petri dishes based on their RGB images.

Do you know this rule of thumb? A dumb algorithm with lots of data beats a clever one with modest amounts of it [1].


          Detection of microbial colonies

Ok, so let’s start with detecting microbes. Imagine that we have images of a Petri dish (the circular glass dish commonly used to hold growth medium for culturing microbial cells in laboratories). Example photos of such dishes in the different setups of AGAR images are presented in the left column of Figure 1. The other 5 columns present fragments of photos (we call them patches) containing 5 different microbe types. Now it is easy to understand what microbe detection means. We simply have to determine the exact position and size of each microbe colony by marking it with a blue rectangle (we call it a bounding box), as in Figure 1 and zoomed in Figure 2.

Figure 1. Different types of microbes (columns 2-6) grown inside a Petri dish (1st column) under different lighting setups (each row). Each blue bounding box was predicted by a trained detector.

It seems easy for trained professionals, but note that microbial colony edges may be blurred, the colony itself may be very small (even a few pixels across), or the camera settings (e.g. focus or lighting) may be inadequate (see, for example, the lighting conditions in the 3rd row of Figure 1). Moreover, some colonies may overlap, which makes deciding where one colony ends and another begins very challenging. That’s why it is really difficult to build an automatic system for microbial colony localization and classification.

          Figure 2. Zoomed patches with predictions made by our detector. Note that colonies may have very different sizes: some colonies are very small, some big with blurred edges, and some may overlap — this makes detection really challenging.

To do so, we developed a deep learning model for microbe detection. Deep learning is a family of AI models based mainly on artificial neural networks. Such modern approaches have turned out to be extremely successful in many areas, for example in computer vision and machine translation. Deep learning object detectors (in our case detecting microbial colonies) are complicated multistage models with hundreds of layers, each consisting of hundreds of neurons.

AI tries to solve tasks that are relatively easy for humans but extremely difficult to program explicitly.

Here we adopt two-stage detectors from the Region-based Convolutional Neural Network (R-CNN) family [2,3], which are known to be slow but very precise (in comparison to single-stage detectors, e.g. the famous YOLO [4]). See Figure 3 for a short explanation of how they work, and the inference sketch that follows it. For a more detailed explanation of various object detection algorithms, see our previous blog post on this matter.

Figure 3. Two-stage architecture of the R-CNN detector. The first stage (a) generates region proposals – smaller parts of the original image that might contain the objects we are searching for. In the second stage, (b) each region proposal is encoded into a feature vector by a deep CNN, and (c) each proposal is classified: is it relevant and, if so, what class of object does it contain. Figure adapted from [2].
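
For readers who want to see a two-stage detector in action, below is a minimal inference sketch using the pretrained Faster R-CNN shipped with torchvision. It is illustrative only – the image file name is a placeholder, this is not our training pipeline, and newer torchvision versions expect a `weights=` argument instead of the deprecated `pretrained=True`.

```python
# A minimal inference sketch (illustrative only, not our training pipeline).
import torch
import torchvision
from torchvision.transforms import functional as F
from PIL import Image

# Pretrained two-stage detector from the R-CNN family.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()

image = Image.open("petri_dish_patch.jpg").convert("RGB")  # placeholder file
inputs = [F.to_tensor(image)]  # list of CxHxW tensors scaled to [0, 1]

with torch.no_grad():
    outputs = model(inputs)

# Each output dict holds predicted boxes, class labels and confidence scores.
boxes = outputs[0]["boxes"]    # (N, 4) tensor: x1, y1, x2, y2
scores = outputs[0]["scores"]  # (N,) tensor, sorted by descending confidence
print(boxes[scores > 0.5])     # keep only confident detections
```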

          Training the detector

Having presented the results of microbe detection, let’s check how the detector is built and trained. This involves not only supervised training of a neural network: data preprocessing and postprocessing are also present in the training scheme in Figure 4. To train a deep learning model in a supervised manner, we need a labeled dataset. As mentioned previously, we use the AGAR dataset, consisting of images of Petri dishes with labelled microbial colonies.

A characteristic feature of neural networks is that the model’s architecture is strictly tied to the input size. When training (and evaluating) the network we are limited by the available memory, so we are not able to process the whole high-resolution image at once and have to divide it into many patches. This process is not straightforward, because when cutting the image into patches we have to ensure that a given colony appears in its entirety on at least one patch; a sketch of such overlapping patching is shown below.
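
Below is a minimal sketch of such a patching scheme, under our own simplifying assumptions: neighbouring patches overlap, so any colony smaller than the overlap is guaranteed to appear whole on at least one patch.

```python
# A minimal sketch (simplified assumptions, not our exact pipeline): cutting
# a large image into overlapping patches so that any colony smaller than the
# overlap appears in its entirety on at least one patch.
import numpy as np

def split_into_patches(image: np.ndarray, patch: int = 512, overlap: int = 64):
    """Yield (top, left, patch_array) tuples covering the whole image."""
    h, w = image.shape[:2]
    stride = patch - overlap  # neighbouring patches share `overlap` pixels
    for top in range(0, max(h - overlap, 1), stride):
        for left in range(0, max(w - overlap, 1), stride):
            bottom = min(top + patch, h)
            right = min(left + patch, w)
            yield top, left, image[top:bottom, left:right]

# Any colony whose diameter is below `overlap` pixels is guaranteed to be
# fully contained in at least one of the generated patches.
dish = np.zeros((4000, 4000, 3), dtype=np.uint8)  # stand-in for a real photo
patches = list(split_into_patches(dish))
print(len(patches))
```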

After that, we were ready to train the detector (upper row in Figure 4). We selected 8 different models from the R-CNN family to make a comprehensive comparison. After the detectors were trained, we tested them (lower pipeline in Figure 4) on photos (in fact, on patches) unseen during training to make sure that the tests were fair. Note that the patches prepared for testing are simply cut evenly – at this stage we cannot use information about where the bounding boxes lie.

          Figure 4. Flowchart for supervised training and evaluating (testing) our neural network models of microbial colonies detector.

          Detection and counting results

We have seen in Figures 1 and 2 that our models detect colonies quite well. But how can detection performance be described quantitatively? There are standard numbers (metrics) that we can calculate to describe the performance of a model on a test set. The most popular is called Average Precision (AP), or mean Average Precision (mAP) in the case of multiclass detection (for a detailed definition see this post). AP and mAP results for two selected R-CNN models (Faster ResNet-50 and Cascade HRNet) evaluated on two subsets of the AGAR dataset (higher- and lower-resolution) are presented in Figure 5 (table on the left).

Generally, the higher the AP value, the more precise the detection – the predicted and true bounding boxes fit each other better. Note, however, that the situation is a bit more complicated here, because we have different microbe types, which means that in addition to finding colonies the detector also needs to classify them.

Different classes of microbes are detected with different fidelities, and this affects mAP, as seen in Figure 5. For example, the small but sharp-edged colonies of S. aureus bacteria are detected and marked better (AP about 65%) than the big but blurred colonies of P. aeruginosa (AP about 50%), which also tend to aggregate. It is also worth mentioning that our results seem excellent compared to reports for the same architectures on the famous COCO dataset: 45% for Cascade R-CNN and 37% for Faster R-CNN [5]. At the heart of these metrics lies the Intersection over Union measure, sketched below.
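
Underlying AP is the Intersection over Union (IoU) measure of box overlap: a predicted box counts as a true positive when its IoU with a ground-truth box exceeds a threshold (commonly 0.5). A minimal sketch:

```python
# A minimal sketch: Intersection over Union (IoU), the box-overlap measure
# underlying AP/mAP computations.
def iou(box_a, box_b):
    """Boxes as (x1, y1, x2, y2). Returns IoU in [0, 1]."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143
```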

The final task, strictly related to detecting every colony on the Petri dish, is counting. After detecting all the microbial colonies, we sum them up and compare this number with the ground-truth number of colonies for a given Petri dish. The results for counting by the same two models on the AGAR test subsets are presented in Figure 5 (plots on the right).

On the x-axis we have the ground-truth number of colonies for different dishes, estimated by trained professionals, while on the y-axis we have the value predicted by our models – each (truth, predicted) pair is represented by a single black point on these plots. Obviously, in the case of ideal predictions all points would lie on the y = x curve, represented by the black line. Luckily, the vast majority of points lie near this curve – the models count quite well. Two additional blue curves mark +/- 10% counting error, and we can see that only a minority of points (especially for densely populated dishes with more than 50 colonies) lie outside this area.

The average counting errors were measured by the mean absolute error (MAE), defined e.g. in this blog, and the so-called symmetric mean absolute percentage error (sMAPE), which measures accuracy based on percentage errors [6]. In general, sMAPE does not exceed 5%, which is quite a good result. Both metrics are simple enough to sketch in a few lines of code, as shown below.
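
Here is a minimal sketch of the two counting-error metrics (the numbers are invented for illustration, not our actual results):

```python
# A minimal sketch: the two counting-error metrics used above.
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error: average |predicted - true| colony count."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean(np.abs(y_pred - y_true))

def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, in percent."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return 100.0 * np.mean(2.0 * np.abs(y_pred - y_true) /
                           (np.abs(y_true) + np.abs(y_pred)))

true_counts = [12, 48, 150]   # illustrative numbers, not our results
pred_counts = [11, 50, 140]
print(mae(true_counts, pred_counts), smape(true_counts, pred_counts))
```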

Figure 5. The quality of microbial colony detection: on the left, average precision results describing the fidelity of detection itself; on the right, a comparison of colony counting – predicted vs. true number of colonies (ideal predictions lie on the black y = x curve).

          Conclusions

In summary, in this article we presented deep learning studies on the recognition of microorganisms on Petri dishes. The selected R-CNN models perform very well in detecting microbial colonies. Detection is facilitated by the fact that the colonies have similar shapes and all species of microbes are well represented in the training data, proving the utility of the AGAR dataset. Moreover, the results obtained with the base Faster R-CNN and the more complex Cascade R-CNN do not differ much.

As discussed above, the detectors are more accurate for samples with fewer than 50 colonies. However, they still give very good estimates for dishes with hundreds or even thousands of colonies, like those presented in Figure 6, correctly identifying single colonies in highly populated samples. In the extreme case, the maximum number of detected colonies on one plate was 2782. It is worth noting that this took seconds for the deep learning system, while it could take up to an hour in the case of manual counting. Moreover, in some situations the detectors were able to recognize colonies that were difficult to see and missed by humans. These cases confirm the benefits of building an automatic microbial detection system, and show that this can be successfully achieved using modern deep learning techniques.

          References

[1] P. Domingos, A few useful things to know about machine learning, Commun. ACM, vol. 55, pp. 78–87, 2012.

          [2] R. Girshick et al., Rich feature hierarchies for accurate object detection and semantic segmentation, Proceedings of the IEEE conference on computer vision and pattern recognition, 2014.

          [3] A. Mohan, Object Detection and Classification using R-CNNs, very detailed blog on RCNN models, 2018.

          [4] J. Redmon et al., You only look once: Unified, real-time object detection, Proceedings of the IEEE conference on computer vision and pattern recognition, 2016.

          [5] J. Wang et al., Deep high-resolution representation learning for visual recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020.

          [6] S. Majchrowska, J. Pawłowski, G. Guła, T. Bonus, A. Hanas, A. Loch, A. Pawlak, J. Roszkowiak, T. Golan, and Z. Drulis-Kawa, AGAR a Microbial Colony Dataset for Deep Learning Detection, 07 July 2021, Preprint available at arXiv [arXiv:2108.01234].

          Project co-financed from European Union funds under the European Regional Development Funds as part of the Smart Growth Operational Programme.
          Project implemented as part of the National Centre for Research and Development: Fast Track.

Visual Place Recognition – part 2 https://dev.neurosys.com/blog/visual-place-recognition-part-2 Fri, 10 Sep 2021 13:04:03 +0000 https://dev.neurosys.com/?post_type=article&p=9203 At NeuroSYS we are currently working on user localization for our AR platform called nsFlow, which enables factory workers to perform their duties by displaying instructions through smart glasses, so that no prior training or supervision is needed. As we explained in our previous post, knowledge about the employee’s location is crucial to ensure proper guidance and safety in the factory.

          To solve the user localization problem we utilized algorithms from the Visual Place Recognition (VPR) field. 

          In the first part of this series, we provided a general introduction to VPR. Today we would like to present the solution we came up with for nsFlow.


As stated in our last post, VPR is concerned with recognizing a place based on its visual features. The recognition process is typically broken down into 2 steps. First, a photo of the place of interest is taken and keypoints (regions that stand out in some way and are likely to be found in other images of the same scene) are detected on it. Next, they are compared with keypoints identified on a reference image, and if the 2 sets of keypoints are similar enough, the photos can be considered to represent the same spot. The first step is carried out by a feature detector and the second by a feature matcher.

            But how can this be applied to user localization?

Since we didn’t need the exact location of the user and only wanted to know in which room or at which workstation he/she was staying, the problem could be simplified to place recognition. To that end, we used algorithms belonging to the VPR family. Specifically, we focused on SuperPoint and SuperGlue, which are currently state-of-the-art in feature detection and matching. Additionally, we applied NetVLAD for faster matching.

            So much for the reminder from the last post. Now let’s move on to the most interesting part of this series, which is our solution.

            Fig 1. The overall architecture of our VPR system. The green path shows how the initial ranking based on global descriptors is created. The red path represents the processing of local descriptors to get the final score for a query image.

            Databases

            As you can see in the graph above our system contains two databases:

            • an image database;
            • a room database.

            The role of the first one is to store the image of each location (possible workstation or room), as well as several additional properties, namely:

            • a unique identifier;
• the image’s global descriptor;
• the image’s keypoints and local descriptors.

The room database associates each unique identifier with a room. A structure like this allows the system to be distributed between local machines (the room database) and a computational server (the image database), thus increasing robustness and performance. Let’s now take a closer look at some of the image properties.

Keypoint detection and matching

            As stated above, a VPR system needs a feature detector to identify keypoints and a feature matcher to compare them with the database and choose the most similar image. Each keypoint contains its (x, y)^T coordinates and a vector describing it (called the descriptor). The descriptor identifies the point and should be invariant to perspective, rotation, scale and lighting conditions. This allows us to find the same points on two different images of the same place (finding the pairs of keypoints is called matching).

In our case we used a deep neural network called SuperPoint to detect keypoints. We chose it over classical methods of computing features because it is able to extract more universal information. The other advantage of selecting SuperPoint is that it performs better in tandem with the feature-matching deep neural network named SuperGlue than other keypoint extractors do.

SuperGlue also shows improved robustness in comparison to classical feature matching algorithms. In order to use it, we needed to implement the network from scratch based on this paper. This was a challenge in itself and might be the topic of a future article. With our implementation we achieved results similar to those from the paper. The image below exemplifies how our network performs, and a simplified matching sketch follows it.

            Fig 2. An example of matched keypoints found by our VPR system.
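
To give a feel for what descriptor matching involves, here is a minimal sketch of brute-force mutual nearest-neighbour matching – the simple baseline that a learned matcher like SuperGlue improves upon. The descriptor shapes are assumptions for illustration, not SuperPoint’s actual output.

```python
# A minimal sketch (a simplification, not SuperGlue itself): brute-force
# mutual nearest-neighbour matching of two sets of local descriptors.
# SuperGlue replaces this with a learned graph neural network.
import numpy as np

def mutual_nn_matches(desc_a: np.ndarray, desc_b: np.ndarray):
    """desc_a: (N, D), desc_b: (M, D) L2-normalized descriptors.
    Returns index pairs (i, j) where i and j are each other's best match."""
    sim = desc_a @ desc_b.T        # cosine similarity matrix, shape (N, M)
    nn_ab = sim.argmax(axis=1)     # best match in B for each keypoint in A
    nn_ba = sim.argmax(axis=0)     # best match in A for each keypoint in B
    return [(i, j) for i, j in enumerate(nn_ab) if nn_ba[j] == i]

a = np.random.randn(100, 256); a /= np.linalg.norm(a, axis=1, keepdims=True)
b = np.random.randn(120, 256); b /= np.linalg.norm(b, axis=1, keepdims=True)
print(len(mutual_nn_matches(a, b)))
```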

Even though SuperPoint and SuperGlue run at around 11 FPS (2x NVIDIA GeForce RTX 2080 Ti), calculating the matches for all images in the database would be inefficient and introduce high latency into the localization system. To solve this problem, we added a step before local feature matching that allows us to roughly estimate similarity and process further only the most promising frames. Here we introduce the concept of global descriptors and their matching.

Global descriptors and matching

            In order to roughly estimate the similarity between two images we use global descriptors. They take the form of a vector that uniquely identifies the scene in a global sense. Here are some properties that the global descriptor should have:

            • it should be invariant to the point of view – the same scene viewed from different perspectives should have global descriptors that are near each other in the vector space;
            • it should be invariant to lighting conditions – the same scene viewed at different times of the day and under different weather conditions should have similar global descriptors;
            • it should be insusceptible to temporary objects – the descriptor should not encode information about cars parked in front of the building or people walking by, but only information about the building itself.

            In our case we used a deep neural network named NetVLAD to calculate the global descriptors. The network returns a vector that has all the aforementioned properties.

Similarly to brute-force local descriptor matching, we calculate the distances between one descriptor and all others. Then we further process the images of the top N “most similar” (closest) descriptors. This process can be called global descriptor matching; a minimal sketch of the ranking step is shown below.
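
The descriptor dimension and database size in the sketch are illustrative assumptions, not our actual configuration.

```python
# A minimal sketch: top-N retrieval by global-descriptor distance.
import numpy as np

def top_n_candidates(query_desc: np.ndarray, db_descs: np.ndarray, n: int = 10):
    """query_desc: (D,), db_descs: (K, D). Returns indices of the N closest
    database images by Euclidean distance."""
    dists = np.linalg.norm(db_descs - query_desc, axis=1)
    return np.argsort(dists)[:n]

db = np.random.randn(1000, 4096)  # NetVLAD-like vectors (assumed size)
query = np.random.randn(4096)
candidates = top_n_candidates(query, db)
# Only these N images go on to the expensive SuperPoint/SuperGlue matching.
print(candidates)
```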

            Combining all parts together

So far we have explained the basic concepts upon which our solution is built and introduced the neural networks we used. Now it is time to combine these blocks into one working system.

As mentioned previously, there are two databases: one associating each image’s identifier with a room (for simplicity called the room database) and one storing more complex information about each image (keypoints and global descriptors). In order to localize the user, a query with an image of the current view is sent to the localization system. The server first calculates the necessary information about the new image – its global descriptor and keypoints. Next, it performs a rough estimation of similarity by calculating the distances between the global descriptors of the query image and the images in the database. Subsequently, the N records corresponding to the shortest distances are chosen and processed further by SuperGlue, which compares keypoints detected on the query image with keypoints identified on the N chosen images from the database. Finally, the user’s location is determined based on the number of matching keypoints. The sketch below ties these steps together.
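
Every name and stub in the following sketch is our assumption for illustration – the real system uses NetVLAD, SuperPoint, and SuperGlue instead of the random stand-ins – but it shows the coarse-then-fine query flow end to end.

```python
# A minimal, self-contained sketch of the whole query flow (stub data only).
import numpy as np

rng = np.random.default_rng(0)

# Stub "databases": 100 images with global and local descriptors, plus an
# id -> room mapping. In reality the vectors come from NetVLAD/SuperPoint.
db_global = rng.normal(size=(100, 4096))
db_local = [rng.normal(size=(200, 256)) for _ in range(100)]
room_db = {i: f"room_{i % 10}" for i in range(100)}

def localize(q_global, q_local, n_candidates=10, min_matches=30):
    # Step 1: cheap global-descriptor ranking -> top-N candidate images.
    dists = np.linalg.norm(db_global - q_global, axis=1)
    candidates = np.argsort(dists)[:n_candidates]

    # Step 2: expensive local matching only on the shortlist (a crude
    # mutual-nearest-neighbour stand-in for SuperGlue).
    best_id, best_score = None, 0
    for idx in candidates:
        sim = q_local @ db_local[idx].T
        nn_ab, nn_ba = sim.argmax(axis=1), sim.argmax(axis=0)
        matches = sum(1 for i, j in enumerate(nn_ab) if nn_ba[j] == i)
        if matches > best_score:
            best_id, best_score = idx, matches

    # Accept only if enough keypoints agree; otherwise report "unknown".
    return room_db[best_id] if best_score >= min_matches else None

# With random stubs this will usually print None ("unknown location").
print(localize(rng.normal(size=4096), rng.normal(size=(200, 256))))
```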

            That’s all we wanted to show you about VPR and our user localization system. We hope you found it interesting. In the next and last part of this series we will present how our localization system works in practice. Feel free to leave comments below if you have any questions. Stay tuned to read on! 

If you want to find out more about nsFlow, please visit our website.

            Do you wish to talk about the product or discuss how your industry can benefit from our edge AI & AR platform? Don’t hesitate to contact us!

            Project co-financed from European Union funds under the European Regional Development Funds as part of the Smart Growth Operational Programme.
            Project implemented as part of the National Centre for Research and Development: Fast Track.

3 practical AI/ML applications in pharmaceuticals https://dev.neurosys.com/blog/3-practical-ai-ml-applications-in-pharmaceuticals Wed, 08 Sep 2021 14:41:27 +0000 https://dev.neurosys.com/?post_type=article&p=9258 The clash of two worlds, artificial intelligence and pharma, has revolutionized the latter. What used to look like a pipe dream or a scenario straight from sci-fi movies is now increasingly used in real-life applications.

Artificial intelligence, one of the fastest-growing technologies of today (and tomorrow!), is changing the face of modern pharma. However, AI and its specialized branches, like machine learning (ML), are no newcomers to the broadly understood medical field. Carefully applied algorithms already diagnose cancer, manage health information, match patients with doctors, and much more.

            Pharmaceutical challenges

The medical subdomain we focus on in this article, covering the development and manufacturing of medicinal drugs, owes much to the latest technologies. Before a medication can be marketed, great efforts must be made. Once promising compounds are selected for trials, scientists conduct tests on their pharmacodynamics (how the substance affects the functions of the body) and pharmacokinetics (how the body absorbs, distributes, metabolizes, and excretes the substance). In traditional research and development of new medicines, the approximate time for a new drug to make its way from discovery to market is on average 10 years, at a cost of over $2 billion. A substantial part of the overall cost comes from necessary failures, as among the vast number of candidate substances only a small percentage will eventually be authorized for use.

Artificial intelligence is applied in the healthcare industry due to algorithms’ potential to mimic human-like cognitive functions in response to their surroundings. AI predicts certain outcomes based on processed data about existing items and their properties. Employing machine learning arms scientists with powerful algorithms capable of carrying out processes based on large datasets, which are indispensable in medical research. Machine learning plays an important role in preliminary, early-stage drug discovery, through processing the datasets and analyses necessary for further steps.

            Take a look at how algorithms are changing the pharmaceutical industry in a world that can’t wait for new treatments and solutions to revolutionize healthcare. 

            3 real-life use cases of AI/ML applications in the pharmaceutical industry

            Drug discovery and development 

            The application of AI and ML leads to significant savings in the costly process of pharmaceutical research and development. Algorithms allow shortening of the drug discovery process, leading to faster rejection of dead-end paths in medicinal research. 

            For example, Exscientia, a British pharmaceutical company, developed a drug (known under the working name DSP-1181) using an algorithm. The formula is meant to aid patients with obsessive-compulsive disorder and is the third AI-powered research project of the company. According to the company’s statement, employing AI shortened research from several years to 12 months [1].

Another example of employing algorithms in drug discovery is the AI-powered research conducted at MIT focused on Escherichia coli bacteria, based on the analysis of data on over 2,000 growth-inhibiting molecules. What followed was a molecular analysis of existing drugs performed by algorithms, which resulted in a selection of 23 structures for further testing. The research allowed scientists to pick an effective compound, halicin, capable of killing antibiotic-resistant bacteria. Scientists engaged in the work believe that, aside from achieving the set goal, the “side effects” of this particular research can result in the development of several other remedies, based on the conclusions about bactericidal chemical structures discovered by the applied algorithms [2].

            Drug screening

Algorithms support medicine development by assessing the properties of substances subjected to testing. AI predicts physical properties, bioactivity, and toxicity in a far more efficient and accurate way than traditional methods. Screening involves high-throughput libraries – rapid tests of thousands to millions of compounds assessed for biological activity – where individual substances usually cost up to $100 each. Traditionally executed screening not only costs several million dollars but also takes months of work. Adopting artificial intelligence shortens the screening time, as even billions of compounds can be examined within days [3].

            Drug design

During a structure-based drug development process, one of the vital elements is examining the 3D structure of the selected protein. The process can be performed by e.g. homology modeling (comparative protein modeling – constructing a model of the protein that illustrates its features at atomic resolution), but artificial intelligence takes the process to a new level of drug design. Utilizing deep neural networks (DNNs), AI predicts protein properties and provides a score used to assess the precision of the examined 3D protein model. As a result, artificial intelligence tools, like Google DeepMind’s protein structure library, can find structure models matching predictions, leading to faster drug development [4].

            Drug manufacturing 

            Automated chemical synthesis

Automation in the pharmaceutical industry increases reliability and reproducibility, presenting benefits similar to those in other industries, e.g. reducing the valuable expert time spent on tiresome, repetitive tasks. Contrary to traditional methods, which allow only a narrow focus on tested substances, AI-powered processes enable the complex, multistep synthesis of intricate compounds for more efficient and faster production.

            Automated microbiological analysis

At NeuroSYS, deep learning algorithms were employed to support the detection of bacterial colonies in images of Petri dishes. The solution, aimed at improving the accuracy of automated microbiological analysis, recognized whether the assessed samples remained free from bacteria (negative samples) or bacteria appeared on the examined dishes (positive samples). Analysis of the sample images was performed without the supervision of a human specialist. The solution is applicable in the pharma industry and industrial microbiology, accelerating the process and minimizing the risk of the human errors that previously led to false-positive results. Furthermore, employing deep learning algorithms made it possible to avoid unnecessarily costly multispectral cameras, improving the project’s cost efficiency.

            Drug repurposing

Algorithms can also find new paths for already known solutions. Prediction of interactions between medicines and proteins is the key to successful therapy. Using AI in drug repurposing enables faster prediction of drug-protein interactions and the discovery of new uses, qualifying the treatment in question directly for Phase II of clinical trials and omitting the Preclinical Phase (testing on non-human subjects), Phase 0 (testing pharmacokinetics, e.g. oral bioavailability), and Phase I (dose-ranging clinical trials on volunteers). Successful repurposing not only shortens development time and reduces financial expenses, but also indirectly contributes to reducing the exploitation of animals in the pharmaceutical industry.

            AI supporting hardware in the pharmaceutical industry 

            Smart devices

The pharmaceutical industry welcomes the Internet of Things with open arms. Smart wearables foster the use of miniature biosensors, supporting research in the medical field. Smart devices clear the way for gathering large amounts of data for further processing, e.g. in drug efficacy analysis and in improving patients’ adherence to medical prescriptions, all in compliance with data safety policies. In a wider perspective, wearables support the recognition of health issues and shorten hospital stays by extending in-home treatment.

            Preventive maintenance

There is no large-scale drug manufacturing without automation and specialized machines, often working 24/7 to meet market demands. Similarly to other industrial branches, the pharmaceutical industry aims to avoid downtimes and unplanned shutdowns caused by equipment failures. Algorithms come to the rescue, gathering vital data via dedicated sensors and constantly monitoring components, mechanisms, and their wear. Incorporating e.g. AR solutions enables performance optimization, integration of control systems, monitoring of crucial parameters, and performing maintenance without production line stoppages.

            Near future

Despite occasional opinions that AI/ML adoption may be happening too fast, recognized authorities consider AI/ML the future of the medical field, with pharmaceuticals being no exception. A McKinsey report forecasts that algorithm- and data-powered solutions could generate $100 billion in value annually, resulting from precise tools for research and development in the field. The use of AI/ML in healthcare creates new opportunities for more efficient, safer, and quicker processes aimed at improving health and wellbeing. Verification of properly executed chemical transformations, total elimination of impurities, constant monitoring of correct compound proportions, and adequate packaging of particular medications in the right dosages can all be carried out under the watchful eye of AI.

The future of AI/ML in the pharmaceutical industry will most likely involve further development of tools and pharmaceutical process automation software specialized for tasks like laboratory process automation, property prediction, molecular generation, chemical synthesis procedures, and molecule complexity evaluation. Using algorithms not only frees staff’s hands but also offers new possibilities to an industry that is increasingly adopting the most recent technologies.

Want to stay ahead of your competition? Book your free consultation and see how AI can power your operation!

            Sources

            [1] Clinical Trials Arena
            [2] Massachusetts Institute of Technology
            [3] Forbes
            [4] Nature

In AI We Trust (but should we really?) https://dev.neurosys.com/blog/in-artificial-intelligence-we-trust Mon, 09 Aug 2021 10:06:40 +0000 https://dev.neurosys.com/?post_type=article&p=9139 Recent advancements in Artificial Intelligence (AI) have led us to a point where AI-based technologies surround us and assist us in many daily routines. Some people may not realize it, but AI is already an inherent part of our lives. Just think of how much time you spend on your smartphone, which is loaded with AI algorithms. Whenever you take a photo, browse your gallery, or enjoy augmented reality features, AI assists you. Maybe you just scroll through Facebook or Instagram, or look for a new TV show on Netflix or music on Spotify – it is all powered by AI-based recommender systems. And it is not only about leisure activities. AI is already an integral part of automation and robotics, surveillance, e-commerce, and agriculture 4.0, and it is also finding its way into many other sectors, like healthcare, human resources, and banking. AI beats humans at chess, Go, and video games. It has even started to replace human work in some areas. We, as individuals and as a society, have become dependent on AI. With the recent progress in deploying machine learning (ML) models on edge devices, we can only expect more and more AI systems around us. Are we ready for this? Are we reaching the technological singularity?


              General artificial intelligence

Stephen Hawking once said: “A superintelligent AI will be extremely good at accomplishing its goals, and if those goals aren’t aligned with ours, we’re in trouble”. Fortunately, we have not even reached the level of general artificial intelligence. While some people believe it is not going to happen soon (or ever), many science and technology leaders are not so sure about that. DeepMind’s researchers have just published “Reward is enough” [1]. They hypothesize that reinforcement learning agents can develop general AI by interacting with rich environments under a simple reward system. In other words, an agent driven by one goal – maximizing its cumulative reward – can learn a whole range of skills, including knowledge, learning, perception, social intelligence, language, and generalisation.

It is likely that general AI is coming sooner or later. Meanwhile, the number of narrow AI solutions is growing exponentially. We need tools to ensure that AI serves us well. Governments have already started to work on new laws regulating AI. Just this April, the European Commission released the Artificial Intelligence Act [2] to regulate AI technologies across the European Union.

              The ethics of AI

At the same time, there is an ongoing global debate on AI ethics. However, it is difficult to define ethical AI. We do not have a precise definition of what is right and wrong – it depends on the context, and cultural differences come into play. There are no universal rules that could be implemented in ethical AI systems. Not to mention the extra layer of difficulty this would introduce into the process of building deep learning solutions, which is already a hard task. It is possible that defining ethical AI will progress iteratively over time and – since we cannot predict all possible failures and their consequences – we will have to make mistakes and hopefully learn from them. In any case, there are some measures we – as the community of AI developers – can take to ensure the quality, fairness, and trustworthiness of our software.

              The important aspects of ML systems development

In data-driven AI system development, it is extremely important to understand and prepare the data used to build a model. This is a necessary step to properly choose a methodology and create a reliable system. Moreover, it is required to minimize the risk of passing past prejudices and biases hidden in the data into the final AI system. While it is tempting to start a project by prototyping deep learning algorithms, it is crucial to first fully understand the data and the underlying problems. If necessary, domain experts should be involved in the process. It is also important, as for any software application, that proper security measures are applied to protect the data.

The next step in data preparation is the choice of training, validation, and test sets. Cross-validation is a commonly used technique to split a dataset into these three groups and to verify the ability of a model to generalize to new samples (a minimal sketch follows below). It has to be noted that while this method is widely adopted, it has one drawback – there is a silent assumption that the data is independent and identically distributed (the i.i.d. assumption). In most real-world scenarios, i.i.d. does not hold. This does not mean that cross-validation should not be used, but AI developers must be aware of it to properly predict possible failures.
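
Here is a minimal cross-validation sketch with scikit-learn; the model and data are placeholders, and the closing comment flags the i.i.d. caveat discussed above.

```python
# A minimal sketch: k-fold cross-validation with scikit-learn. The model
# and data are stand-ins; the point is the repeated train/validation split.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X = np.random.randn(500, 20)           # stand-in features
y = np.random.randint(0, 2, size=500)  # stand-in binary labels

scores = []
for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    scores.append(accuracy_score(y[val_idx], model.predict(X[val_idx])))

# Note: this silently assumes i.i.d. samples; with grouped or shifted data,
# use e.g. GroupKFold or a time-based split instead.
print(np.mean(scores))
```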

There is another well-known problem related to the i.i.d. assumption, called domain or distribution shift. In short, it means that the training dataset is drawn from a different distribution than the real data used to feed the AI system after deployment. For example, a model trained solely on stock images may not work when it is later applied to users’ photos (different lighting conditions, quality, etc.), and an autonomous car taught to drive during the daytime may not be able to perform flawlessly at night. It is important that AI developers take into account that their model may fail in real life even if it works perfectly “in the lab” – and, if possible, use one of the domain adaptation techniques to minimize the effect of distribution shift.

The right choice of metric is also crucial for AI system development. The commonly used accuracy may be a good option for some image classification tasks, but it fails to correctly represent the quality of a model on an imbalanced dataset. In such cases, the F-score (the harmonic mean of precision and recall) is preferred; the sketch below shows why. MAPE (Mean Absolute Percentage Error) is often chosen for regression tasks. However, it penalizes negative errors (predictions higher than the true value) more than positive ones. If this is not desirable, sMAPE (symmetric MAPE) can be used instead. AI developers have to understand the advantages and shortcomings of metrics to choose the one adequate for the problem being solved.
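
A minimal sketch of the accuracy pitfall on imbalanced data (the class counts are invented for illustration):

```python
# A minimal sketch: why accuracy misleads on imbalanced data and how the
# F-score exposes the problem. Class counts are invented for illustration.
from sklearn.metrics import accuracy_score, f1_score

# 95 negatives, 5 positives; a model that always predicts "negative":
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100

print(accuracy_score(y_true, y_pred))             # 0.95 -- looks great
print(f1_score(y_true, y_pred, zero_division=0))  # 0.0 -- finds no positives
```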

              Fig 1. The number of AI publications in the last 20 years. (source: AI Index Report 2021 [3])

              Finally, one has to select an appropriate model for the task. There are thousands of AI publications every month (Fig 1). A lot of algorithms are proposed and AI developers have to choose the right one for a particular problem. It is hard to read every paper in the field, but it is necessary to at least know the state-of-the-art (SOTA) models and understand their pros and cons. It is important to follow new trends and to be aware of all groundbreaking approaches. Sometimes it is a matter of a few weeks for a model to lose its SOTA status (Fig 2).

              Fig 2. Top models for image classification on ImageNet. (source: Papers with Code [4])

              Understanding AI methods

Many frameworks have been released to accelerate the ML development process. Libraries like PyTorch or TensorFlow allow for quick prototyping and experimenting with various models. They also provide tools that make deployment easy. Recently, AutoML (Automated Machine Learning) services, which allow non-experts to play around with ML models, have gained popularity. This is definitely a step forward in spreading ML-based solutions across many different fields. However, choosing the right methodology and understanding it deeply are still crucial to building a reliable AI system.

Regardless of the tool used, all the above aspects of the ML development process have to be considered carefully. It is important to remember that AI is goal-oriented and may mislead us about its real performance. Researchers from the University of Washington reported how shortcuts learned by AI may trick us into thinking it knows what it is doing [5]. Multiple models were trained to detect COVID-19 in radiographs, and they performed very well on the validation set, which was created from data acquired in the same hospital as the training set. However, they completely failed when applied to X-rays coming from a different clinic. It turned out that the models had learned to recognize irrelevant features, like text markers, rather than medical pathology (Fig 3).

              Fig 3. Saliency maps indicating the regions with the greatest influence on the models’ predictions. Strong activations come from the parts of images which do not represent medical pathology. (source: AI for radiographic COVID-19 detection selects shortcuts over signal [5])

              Towards reliable and trustworthy AI

On the one hand, machine learning is now more accessible to developers and many interesting AI applications are arising. On the other hand, people’s trust in AI declines as the number of unreliable AI systems grows. It may take years before international standards arrive to certify the quality of AI-based solutions. However, it is time to start thinking about this. Recently, TUV Austria, in collaboration with the Institute of Machine Learning at Johannes Kepler University, released a white paper on how AI and ML tools can be certified [6]. They proposed a catalog to be used for auditing ML systems. At the moment, the procedure is provided only for supervised learning with a low criticality level. The authors list the necessary requirements an AI system must meet and propose a complete workflow for the certification process. It is a great starting point to be extended in the future to other ML applications.

AI is everywhere. At this point, it “decides” what movie you are going to watch this weekend. In the near future, it may “decide” whether you get a mortgage or what medical treatment you receive. The AI community has to make sure that the systems they are developing are reliable and fair. AI developers need to have a comprehensive understanding of the methods they apply and the data they use. Necessary steps must be taken to prevent prejudice and discrimination in data from being passed on to AI systems. Fortunately, many researchers are aware of this and put a lot of effort into developing adequate tools, like explainable AI, to help create AI we can trust.

              References

              [1] Silver, David, et al. “Reward is enough.” Artificial Intelligence (2021): 103535.

              [2] Proposal for a REGULATION OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL LAYING DOWN HARMONISED RULES ON ARTIFICIAL INTELLIGENCE (ARTIFICIAL INTELLIGENCE ACT) AND AMENDING CERTAIN UNION LEGISLATIVE ACTS

              [3] Artificial Intelligence Index Report

              [4] Papers with Code

              [5] DeGrave, Alex J., Joseph D. Janizek, and Su-In Lee. “AI for radiographic COVID-19 detection selects shortcuts over signal.” Nature Machine Intelligence (2021): 1-10.

              [6] Winter, Philip Matthias, et al. “Trusted Artificial Intelligence: Towards Certification of Machine Learning Applications.” arXiv preprint arXiv:2103.16910 (2021).

              Project co-financed from European Union funds under the European Regional Development Funds as part of the Smart Growth Operational Programme.
              Project implemented as part of the National Centre for Research and Development: Fast Track.

              6 Artificial Intelligence/Machine Learning use cases in manufacturing  https://dev.neurosys.com/blog/ai-ml-use-cases-in-manufacturing Thu, 05 Aug 2021 14:35:06 +0000 https://dev.neurosys.com/?post_type=article&p=9120 The future is here. 

              The concept of intelligent machines has accompanied engineers since the beginning of the 20th century. What seemed once a pipe dream, finally becomes an integral part of everyday tasks. 

At last, humans have constructed machines capable of learning and solving problems, fueling the processes of the 4th industrial revolution.

              Utilizing modern solutions, including, but not limited to, industrial internet of things, cloud computing, artificial intelligence, and machine learning, companies arm their manufacturing plants for the ultimate race, remaining relevant and efficient in the reality of Industry 4.0. 


                Artificial intelligence

Artificial intelligence, in short, is the technology that powers machines to perform human-like operations. In particular machine learning, a subset of artificial intelligence, lets machines mimic human behavior. The more complex the task, the better – as Moravec’s paradox observes, what is simple and intuitive for humans is the most challenging to teach machines. The skills we learn in our first months and years of life are the hardest to teach to robots. Conversely, the tasks we find difficult to comprehend, perform, and optimize are the ones robots can take off our shoulders.

Global companies utilize AI/ML in their manufacturing plants to automate and streamline processes and to achieve unprecedented results in their fields. Examples include enterprises with large factories carrying out diverse processes that require tailor-made solutions, often developed by in-house research & development departments.

                6 remarkable applications of AI/ML in manufacturing 

                Automation and robotics

Artificial intelligence powers automation solutions in various industry branches, changing the face of mass production. Robots take on mundane, repetitive work, relieving staff of many tasks. As a 2020 report states, in 2019 there were already 2.7 million robots working in manufacturing plants worldwide. In times of the 4th industrial revolution, factories modernize their plants to remain relevant in their fields and keep pace with the ever-changing market. Standing by assembly lines, handling raw materials, welding, machining, packaging, and performing other tasks, robots keep powering global industries. Improving the quality of operations, reducing downtime risk, and working 24/7, including in hazardous environments and difficult conditions, AI-powered robots provide new opportunities for scalable work.

As market forecasts show, automation and robotics in manufacturing remain in high demand, both for the more traditional robots performing simple, repetitive tasks and for the most advanced collaborative robots, designed to work safely alongside a human workforce.

                Maximizing efficiency with predictive analytics

Bosch, the international engineering and technology company, aims to make its production sites green by 2023 utilizing artificial intelligence and machine learning [1]. The German-based enterprise leans towards renewable energy and uses a software solution that combines data on power generation and consumption. The solution uses AI/ML algorithms to forecast generation output and power requirements, creating operating schedules for power consumption. Bosch is moving towards renewable energy sources and purchasing green power, but the plan wouldn’t be possible without the overall sustainability achieved thanks to AI/ML solutions.

                Predictive maintenance

Prevention is better than cure. This approach, attributed to the Dutch philosopher Erasmus, is manifested in technology aimed at detecting possible errors, defects in machinery, or any other anomalies in processes using data analysis. Predictive maintenance is hugely significant in high-risk, hazardous industries, like the mining, oil, and gas industries. Every once in a while a large-scale oil rig accident happens, and its consequences are massive, affecting the industry, society, and the environment. Avoiding such events by employing digital solutions is nowadays the norm, not an exception. With the use of the Industrial Internet of Things (IIoT), companies like BP, Shell, and ExxonMobil reduce malfunction risks and operational costs [2].

Detection of atypical machinery behavior covers various techniques, including infrared, vibration, and sonic analysis. IIoT devices gather information on temperature, possible misalignment of parts, or deficiencies in machine lubrication, which can indicate failures in the near future. Industrial processes under the watchful eye of algorithms benefit from improved safety, productivity, and energy efficiency, and an overall longer machinery lifespan.

                Quality assurance processes

The Bavarian car manufacturing giant, BMW, incorporates AI in its production lines to improve efficiency and quality assurance. Over the 30 hours it takes to assemble an average car in the plant, the company gathers large data sets, which contribute to improving internal processes. Recently, BMW has begun to laser-mark all metal sheets. Thanks to the engraved codes, algorithms aggregate data on the necessary parameters, leading to easier tracking of the manufacturing process and decreasing the number of necessary inspections. The robots working along the production lines carry wear sensors, which signal the right moment for electrode replacement, relieving staff from manually checking the condition of electrodes. In other corners of the plant, sensors and algorithms check for acceptable levels of dust in the paint shop, ensure the correctness of mounted elements, and test the right calibration of car keys, the last example being a “home-grown” robot, Comfort Access [3].

                Generative Adversarial Networks (GANs)

The other automotive giant, known for a range of American car brands, utilizes a different approach to AI in its factories. GANs are a class of machine learning algorithms – neural networks capable of creating images based on given sets of pictures. GM is using GANs in generative design, at the border of artificial intelligence and additive manufacturing, in the name of improved personalization, performance, and customization. The software used by GM analyzes design permutations, recommending the best solutions with regard to the needs and parameters (like materials, budget, etc.) of the components being created. Coupled with 3D printing, GANs create new opportunities in an industry previously dependent on traditional methods, like injection molds. Shapes and combinations impossible to manufacture before can now revolutionize the automotive field in times of growing demand for customization and individual products [4].

The need for personalized cars alone probably wouldn’t be enough to convince decision-makers to utilize machine learning solutions. There is a greater need behind this strategy, as GANs and additive manufacturing are expected to become game-changers in building electric vehicles. Again, it’s not a fantasy but a solution to business needs, leading to improvements in e.g. mass optimization. As many drivers say, “mass is the foe of acceleration”. GM is searching for further fields of GAN application, and improving gas mileage is one of the priorities of the automotive giant.

                Digital twin

A virtual depiction of physical objects, originating from NASA’s idea of improving simulations of physical models, is utilized in several ways in manufacturing processes. Modeling machines, their environment, and data on real-time events improves sustainability, enables better operational decisions, and allows more advanced maintenance. The technology is among the most dynamically developing, with industries aiming to improve their processes through AI-powered monitoring, forecasting, and diagnostics. From CNC machines to civil engineering to aviation, AI-powered modeling solutions let engineers gain broader insights into machinery lifecycles, tackling both everyday and rarely occurring issues.

Among the remarkable examples in manufacturing, General Electric uses digital twins to take its jet engineering projects to levels unattainable with traditional tools. The engineering giant develops more efficient engines, compliant with modern requirements. Digital twins combine data on e.g. materials’ resilience to extreme temperatures, based on information from approximately 100 sensors, leading to highly efficient, powerful designs [5].

Powered by AI algorithms, digital twins combine operational, design, and sensor data to build advanced virtual models of machinery. With an ongoing influx of data, the digital representation of a physical object becomes more accurate, serving industrial processes. Augmented reality solutions, like the nsFlow platform, facilitate work with digital twins, providing nearly instant access to specialist knowledge and shortening downtimes. Digital twins run real-time operations, instantly calculating data from various sources to create advanced models. With the use of DTs, companies can detect abnormalities and forecast maintenance issues, preventing possible downtimes. Once an error is detected in a machine’s functioning, technicians on-site can use solutions incorporating augmented reality to resolve the problem with help from remote experts.

                Where are we headed?

Contrary to popular concerns, artificial intelligence most probably won’t replace the human workforce but should complement and support it, e.g. in the pursuit of modern-world manufacturing in the spirit of Industry 4.0. As certain roles become obsolete, others will emerge. Experts call for staff reskilling and upskilling to equip workers with skills for the near future.

The World Economic Forum estimates that the exponential growth of technology may mow down 85 million jobs in the upcoming years. At the same time, the report states that approximately 97 million new jobs will emerge. The next few years will show how machines and a reskilled human workforce can share duties on the job which, according to the WEF, may be divided roughly in half.

                Sources

                [1] Bosch
                [2] Hydrocarbon Engineering
                [3] Metrology.news
                [4] Autodesk
                [5] BBC
