If you are interested in learning more about how computers understand text, and about natural language processing for text and speech, machine learning applications and other numerical algorithms, you should check out a new tutorial published on the NVIDIA blog. It provides an introduction to preparing text through vectorization, hashing, tokenization, and other techniques.
The article explains the basic algorithms computers use to convert text into vectors, and how algorithms that require numerical inputs can be made to work with textual inputs, enabling developers to create text and speech applications and more.
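To give a rough feel for what converting text into vectors can look like, here is a minimal Python sketch of tokenization followed by hashing-based vectorization. It is not taken from the NVIDIA tutorial: the function names `tokenize` and `hash_vectorize`, the vector length of 16, and the use of CRC32 as the hash function are all assumptions made purely for illustration.

```python
import re
import zlib


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def hash_vectorize(text, n_features=16):
    """Turn a text into a fixed-length count vector via the hashing trick.

    n_features is an arbitrary illustrative size; real systems use far more.
    """
    vector = [0] * n_features
    for token in tokenize(text):
        # CRC32 is used here only because it is deterministic across runs;
        # it is an illustrative choice, not the tutorial's hash function.
        index = zlib.crc32(token.encode("utf-8")) % n_features
        vector[index] += 1
    return vector


docs = ["The cat sat on the mat.", "The dog sat."]

# One vector per document; together the rows form a matrix.
matrix = [hash_vectorize(doc) for doc in docs]

for doc, row in zip(docs, matrix):
    print(doc, "->", row)
```

Every document ends up as a list of numbers with the same length, no matter how many words it contains, which is what lets algorithms that need numerical inputs work with text. Different words can land in the same slot (a collision), which is the trade-off for not having to store a vocabulary.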
How software understands text
“Natural Language Processing (NLP) applies Machine Learning (ML) and other techniques to language. However, machine learning and other techniques typically work on the numerical arrays called vectors representing each instance (sometimes called an observation, entity, instance, or row) in the data set. We call the collection of all these arrays a matrix; each row in the matrix represents an instance. Looking at the matrix by its columns, each column represents a feature (or attribute).”
“So far, this language may seem rather abstract if one isn’t used to mathematical language. However, when dealing with tabular data, data professionals have already been exposed to this type of data structure with spreadsheet programs and relational databases. After all, spreadsheets are matrices when one considers rows as instances and columns as features. For example, consider a dataset containing past and present employees, where each row (or instance) has columns (or features) representing that employee’s age, tenure, salary, seniority level, and so on.”
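The employee example in the quote can be sketched in a few lines of Python. The values below are made up for illustration and do not come from the tutorial, and the feature names (`tenure_years`, `seniority_level`) are assumed labels. Each inner list is one row (an instance), and each position within a row is one column (a feature).

```python
# Column labels: one per feature (names are illustrative assumptions).
feature_names = ["age", "tenure_years", "salary", "seniority_level"]

# Each row is one employee (an instance); the values are invented.
employees = [
    [34, 5, 72000, 2],
    [45, 12, 98000, 4],
    [29, 2, 58000, 1],
]

n_instances = len(employees)     # number of rows
n_features = len(feature_names)  # number of columns

# Reading the matrix by row gives one instance's feature vector.
first_instance = employees[0]

# Reading the matrix by column gives one feature across all instances.
salary_index = feature_names.index("salary")
salary_column = [row[salary_index] for row in employees]

print("shape:", (n_instances, n_features))
print("first instance:", dict(zip(feature_names, first_instance)))
print("salary column:", salary_column)
```

This is the same layout as a spreadsheet: the list of rows is the matrix, and a text dataset takes the same shape once each document has been converted into a vector.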
For more information on how computers understand text, jump over to the official NVIDIA tutorial by following the link below.
Source: NVIDIA