Category Archives: Publishing Industry

Posts about the publishing industry.

Full IAB taxonomy for content classification now available

The surge of content marketing, based on the creation and distribution of valuable content, is a reality. The objective is no longer solely to advertise but to engage and offer valuable experiences to the market as well. In this context, precision in content classification is becoming essential. And that’s why we implemented a new automatic content classifier according to the full IAB taxonomy.

Last year we added tier 3 of the content taxonomy to the categories of our IAB model; we have now chosen to complement it by providing the complete ontology. In line with the Content Taxonomy, published in 2017 and updated in 2020 by the IAB Tech Lab (Interactive Advertising Bureau), this classification includes the remaining 60 categories that make up tier 4. The result is a categorization of over 698 categories, organized into four tiers. We have kept the unique identifier that IAB assigns to each of its categories, as well as each category's name, through which its parent categories can be found.
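As a rough illustration of how parent categories can be recovered from a category's name, here is a minimal Python sketch. It assumes that a name spells out the full path with tiers joined by ">"; that separator, the `Category` class, the helper functions, and the sample identifiers are all hypothetical and are not taken from the IAB taxonomy or from the classifier's actual output.

```python
from dataclasses import dataclass

SEPARATOR = ">"  # assumed delimiter between tiers in a category name


@dataclass(frozen=True)
class Category:
    code: str  # unique identifier (placeholder values below, not real IAB IDs)
    name: str  # full path name, e.g. "Automotive>Auto Body Styles>SUV"


def tier(category: Category) -> int:
    """Return the tier (1 to 4) implied by the number of path segments."""
    return len(category.name.split(SEPARATOR))


def parent_names(category: Category) -> list[str]:
    """Return the names of all ancestors, from tier 1 down to the direct parent."""
    parts = category.name.split(SEPARATOR)
    return [SEPARATOR.join(parts[:i]) for i in range(1, len(parts))]


def find_parents(category: Category, catalog: list[Category]) -> list[Category]:
    """Look up the ancestor categories of `category` in a catalog by name."""
    by_name = {c.name: c for c in catalog}
    return [by_name[n] for n in parent_names(category) if n in by_name]


# Hypothetical catalog entries, for illustration only.
SAMPLE = [
    Category("X1", "Automotive"),
    Category("X2", "Automotive>Auto Body Styles"),
    Category("X3", "Automotive>Auto Body Styles>SUV"),
]

if __name__ == "__main__":
    suv = SAMPLE[2]
    print(tier(suv))
    print([c.code for c in find_parents(suv, SAMPLE)])
```

Keeping the identifier and the path-style name together means a consumer can either match on the stable identifier or walk up the hierarchy to a broader tier when a fine-grained category is too specific.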



IAB Taxonomy Level 3 now available in our Deep Categorization API

Digital marketing is becoming, by leaps and bounds, a fundamental pillar in the business plans of practically every business model. Methods are being refined, and the connection between brand and user is expected to become increasingly precise: a related advertisement is no longer sufficient; the advertisement must now appear at the right time and in the right place. This is where categorization proves to be an exceedingly useful tool.

That is why, at MeaningCloud, we have improved our IAB categorization model in English, which is integrated in our Deep Categorization API, by:

  • Adding a third level of content taxonomy to the hierarchy of categories (IAB Taxonomy Level 3).
  • Improving the precision of pre-existing categories.
  • Including the unique identifiers defined by IAB itself for each of the categories.



Communication during the Coronavirus (I): Thematic analysis in Spanish digital news media

While the obvious priority during this pandemic is to cure the sick, prevent new cases, and ensure that economic and social measures are in place to help the people and businesses most affected overcome the current situation, the content about the coronavirus generated by the media and social network users will undoubtedly become, in the near future, an object of research for numerous disciplines such as sociology, philology, linguistics, audio-visual communication, and politics, to name a few.

At MeaningCloud we want to do our bit in this area, by applying our experience and our Text Analytics solutions to analyze the enormous volume of information in natural language, in Spanish and in other languages, in Spain and in other countries, given that, unfortunately, this is a global crisis.

This first article in the series centers on the thematic analysis of content that has been generated in Spanish by digital media platforms in Spain over the last month, how it has evolved during this period of time and the informative positioning of the main media platforms in Spain.

These other articles (only available, at the moment, in Spanish) analyse conversation topics on Twitter in Spain (both from the hashtags and general topics perspective and also applying a specific thematic categorization) and the linguistic analysis of presidential speeches related to this crisis.



Updated version of the IAB model in the Deep Categorization API


The Interactive Advertising Bureau (IAB) is perhaps the most influential organization in the online advertising business and, currently, brings together more than 650 leading companies in the industry that control 86% of the U.S. market. With a strong presence in the rest of the industrialized world as well, today IAB has become a standard for content classification, especially in fields with strong ties to the digital economy and new social media.

In fact, IAB promotes advertising techniques like behavioral targeting, which allows advertisers to direct marketing campaigns to specific users (according to their age, place of residence, political views, interests, etc.) and thus increase their effectiveness. What’s more, the organization is making consistent progress in the field of geotargeting, an area of digital marketing that is on the rise thanks to the unprecedented diffusion of mobile devices connected to the Internet and the latest advances in Internet-of-things technologies.


Books Are a Service

Semantic Publishing and Voice of the Customer understanding for the media and content industry

The reason publishing is a key industry for text analytics is also the reason the industry finds it so hard to engage with the technology.

The reason? Text. And a lot of it. The publishing world has struggled to understand how data relates to text and to appreciate the value of data. This is changing, too slowly for many, as publishers move from seeing themselves as ‘product’-based companies (e.g. making books, whether e-books or physical) to ‘service’-based companies. In other words, smart publishers are starting to see their service to customers as that of creator and curator of information. This content can be mixed and mashed up in dynamic ways across a number of formats. This service is not bound, saddle-stitch or otherwise, to a specific product. This 180-degree change of perspective requires publishers to think more directly about customer experience, in the same way that more traditional service-based industries like hospitality or even retail banking do.



Text Analytics for Publishing: there’s metadata and smarter metadata

Everyone agrees metadata is great. It helps simplify the management and packaging of content and data. It creates consistency and provenance for your content and data across an organization. Metadata gives you the 35,000-foot perspective that is needed to make strategic decisions. This is especially important for publishers, whose stock in trade is human language, which is completely opaque to machines whose world consists of zeros and ones. Your customers aren’t calling or emailing you to know what is in such and such database. No. They are contacting you because they want to know what monographs you have by such and such professor, or asking you for all the archival material on ‘cats’, ‘World War 2’ or ‘nanotubes’. As a human, you understand exactly what they are looking for. If your ICT has a smidgeon of metadata, you can dig around in that such-and-such database, deliver the content and have a happy customer.

Intelligent content for Semantic Publishing

Metadata makes your content more intelligent. That’s why everyone agrees metadata is great. Great, that is, until they have to either enter the metadata or maintain the vocabularies. Some organizations are lucky. They have ensured there is support within the workflow, and people with the expertise to do the hard work, so that when a customer searches on the website, they quickly find what they are looking for and go away happy. But even those lucky few do not live in isolation. There is no publisher of consequence who doesn’t have to deal with 3rd party content and data. A huge amount of additional effort is spent shoehorning 3rd party content into the metadata models of the organization. Every publisher has a workflow that includes completely throwing away existing metadata and spending additional time and wasted effort adding metadata that their CMS can handle. Does that sound familiar? Does it feel better to know you aren’t the only one?



#ILovePolitics: Popularity analysis in the news

If you love politics, regardless of your party or political orientation, you may know that election periods are exciting moments and having good information is a must to increase the fun. This is why you follow the news, watch or listen to political analysis programs on TV or radio, read surveys or compare different points of view from one or the other side.

American politics in a nutshell


Starting with this post, we are publishing a series of tutorials in which we will show how to use MeaningCloud to extract interesting political insights and build your own political intel reports. MeaningCloud provides useful capabilities for extracting meaning from multilingual content in a simple and efficient way. Combining API calls with open source libraries in your favorite programming language is so easy, and at the same time so powerful, that it will surely awaken the Political Data Scientist hidden inside of you. Be warned!

Our research objective is to analyze mentions of people, places, or entities in general in the Politics section of different news media. We will try to carry out an analysis that can answer the following questions:

  • Which are the most popular names?
  • Does their popularity depend on the political orientation of the newspaper?
  • Is it correlated somehow with popularity surveys or voting-intention polls?
  • Do these trends change over time?
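To make the first two questions concrete, the counting step could look something like the following minimal sketch. It is written in Python purely for illustration and assumes that the names mentioned in each article have already been extracted by some entity extraction step; the `mention_counts` function, the outlet labels, and the sample names are hypothetical placeholders, not real data.

```python
from collections import Counter, defaultdict


def mention_counts(articles):
    """Count name mentions overall and per outlet.

    `articles` is an iterable of (outlet, names) pairs, where `names` is the
    list of names already extracted from one article (assumed input format).
    """
    overall = Counter()
    per_outlet = defaultdict(Counter)
    for outlet, names in articles:
        overall.update(names)
        per_outlet[outlet].update(names)
    return overall, dict(per_outlet)


# Hypothetical input, for illustration only.
SAMPLE_ARTICLES = [
    ("Outlet A", ["Candidate X", "Candidate Y", "Candidate X"]),
    ("Outlet B", ["Candidate Y"]),
    ("Outlet A", ["Candidate X"]),
]

if __name__ == "__main__":
    overall, per_outlet = mention_counts(SAMPLE_ARTICLES)
    print(overall.most_common(2))       # most mentioned names overall
    for outlet, counts in per_outlet.items():
        print(outlet, counts.most_common(1))  # top name in each outlet
```

Comparing the per-outlet counters side by side is one simple way to see whether a name's popularity depends on the newspaper; adding a publication date to each pair would extend the same idea to trends over time.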

Before we begin

This is a technical tutorial in which we will write some code. However, we will try to guide you through the whole process, so everyone can follow the explanations and understand the purpose of the tutorial.

For the sake of generality and better understanding, we will focus on U.S. Politics in English, but obviously you can easily adapt the same analysis for your own country or (MeaningCloud supported) language.

And last but not least, this tutorial will use PHP as the programming language for the code examples. However, any non-rookie programmer should be able to translate the scripts into the language of their choice.



The Analysis of Customer Experience, Touchstone in the Evolution of the Market of Language Technologies

The LT-Innovate 2014 Conference has just been held in Brussels. LT-Innovate is a forum and association of European companies in the language technologies sector. To get an idea of the meaning and importance of this market, suffice it to say that in Europe some 450 companies (mainly innovative SMEs) are part of it, and are responsible for 0.12% of European GDP. Daedalus is one of the fifteen European companies (and the only one from Spain) that have formally been members of LT-Innovate Ltd. since its formation as an association, headquartered in the United Kingdom, in 2012.


LT-Innovate Innovation Manifesto 2014

In this 2014 edition, the document “LT-Innovate Innovation Manifesto: Unleashing the Promise of the Language Technology Industry for a Language-Neutral Digital Single Market” has been published. I had the honor of being part of the round table which opened the conference. The main subject of my speech was the qualitative change that the role of our technologies has recently undergone in the markets in which we operate. For years we have been deploying our systems to solve specific problems, in very limited areas, for our more or less visionary or innovative customers. This situation has already changed completely: language technologies now play a central role in a growing number of businesses.

Language Technologies in the Media Sector

In a recent post, I referred to this same issue with regard to the media sector. Where before we would incorporate a solution to automate the annotation of archive contents, now we deploy solutions that affect most aspects of the publishing business: we semantically tag news items to improve the search experience on any channel (web, mobile, tablets), to recommend related or additional content according to the interest profile of a specific reader, to facilitate findability and indexing by search engines (SEO, Search Engine Optimization), to place advertising related to the news context or the reader’s intention, to help monetize content in new forms, etc.



Semantic Publishing: a Case Study for the Media Industry

Semantic Publishing at Unidad Editorial: a Client Case Study in the Media Industry 

Last year, the Spanish media group Unidad Editorial deployed a new CMS developed in-house for its integrated newsroom. Unidad Editorial is a subsidiary of the Italian RCS MediaGroup, and publishes some of the newspapers and magazines with the highest circulation in Spain, besides owning nationwide radio stations and a DTTV license covering four TV channels.


Newsroom El Mundo

When a journalist adds a piece of news to the system, its content has to be tagged, which constitutes one of the first steps in a workflow that will end with the delivery of this item in different formats, through different channels (print, web, tablet and mobile apps) and for different mastheads. After evaluating different providers’ solutions over the previous months, the company decided that semantic tagging would be done through Daedalus’ text analytics technology. Semantic publishing included, in this case, the identification (with disambiguation) of named entities (people, places, organizations, etc.), time and money expressions, and concepts, as well as classification according to the IPTC scheme (an international standard for the media industry, with around 1400 classes organized in three levels), sentiment analysis, etc.
