Taggers, Extractors, Classifiers

The purpose of KBpedia is to make the combination of its public knowledge bases — Wikipedia, Wikidata, OpenCyc, DBpedia, GeoNames, and UMBEL — computable. Any of the major structural features of KBpedia — concepts, entities, relations, attributes, or events — may be the computable basis for training taggers, extractors or classifiers.

For example, there are about 33,000 different entity types in KBpedia that may be used as the basis for training entity recognizers. KBpedia's standard knowledge bases may be combined with your own data and sources.

Building a new learner typically begins by slicing and dicing KBpedia into appropriate positive and negative training sets. These slices can be refined further by selecting from the rich variety of structural and semantic features in the source data. Unsupervised learning may also be applied at this stage to engineer further features. Standard NLP and machine learning reference sets and statistics are applied during the parameter-tuning and learning phase. Multiple learners and recognizers may also be combined, each contributing a separate signal to an ensemble approach to overall scoring.
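As a rough illustration of the slicing and ensemble steps described above, the following Python sketch splits a handful of entities into positive and negative training sets according to a target type, then combines per-learner scores with a weighted average. Everything in it is an assumption made for illustration: the type hierarchy, the entity list, and the function names (is_subtype, slice_training_sets, ensemble_score) are hypothetical toy stand-ins, not KBpedia data or a Cognonto API.

```python
from collections import namedtuple

# Hypothetical toy hierarchy (child type -> parent type); not actual KBpedia content.
TYPE_PARENT = {
    "City": "PopulatedPlace",
    "PopulatedPlace": "Place",
    "River": "Place",
    "Musician": "Person",
}

Entity = namedtuple("Entity", ["label", "entity_type"])

# Hypothetical sample entities, each assigned one type from the toy hierarchy.
ENTITIES = [
    Entity("Paris", "City"),
    Entity("Danube", "River"),
    Entity("Miles Davis", "Musician"),
    Entity("Iowa City", "City"),
]


def is_subtype(entity_type, target):
    """Walk up the parent links until the target is found or the root is passed."""
    current = entity_type
    while current is not None:
        if current == target:
            return True
        current = TYPE_PARENT.get(current)
    return False


def slice_training_sets(entities, target):
    """Positives are entities typed under the target; negatives are all others."""
    positives = [e.label for e in entities if is_subtype(e.entity_type, target)]
    negatives = [e.label for e in entities if not is_subtype(e.entity_type, target)]
    return positives, negatives


def ensemble_score(scores, weights):
    """Combine per-learner scores into one score by weighted average."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


positives, negatives = slice_training_sets(ENTITIES, "Place")
combined = ensemble_score([0.9, 0.5], [1, 1])
```

Here the target type "Place" puts the two cities and the river in the positive set and the musician in the negative set. A weighted average is only one simple way to combine signals; voting or a trained meta-learner are common alternatives.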

Alternatively, these same slicing-and-dicing capabilities may drive export routines to use your own local machine learners or third-party ML services.
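One way such an export might look is sketched below: labeled examples are written out as CSV, a format most local learners and third-party ML services can ingest. The column names, the 1/0 label encoding, and the function name export_training_rows are assumptions chosen for this example, not a documented Cognonto export format.

```python
import csv
import io


def export_training_rows(positives, negatives, out):
    """Write labeled examples to a file-like object as CSV (1 = positive, 0 = negative)."""
    writer = csv.writer(out)
    writer.writerow(["text", "label"])  # hypothetical column names
    for text in positives:
        writer.writerow([text, 1])
    for text in negatives:
        writer.writerow([text, 0])


# An in-memory buffer stands in for a file on disk.
buffer = io.StringIO()
export_training_rows(["Paris", "Danube"], ["Miles Davis"], buffer)
exported = buffer.getvalue()
```

To write to disk instead, pass a file opened with open(path, "w", newline="") in place of the buffer.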

The nature of the domain problem at hand, which may emphasize different data types, source input formats, or domain data, often calls for tailored approaches to produce more precise scrapers or recognizers. Custom development is an additional service that Cognonto may include with its tagging, extracting and classifying services.

Please contact us to discuss your needs, or return to the main Cognonto Services page.