Thursday 1 July 2010

Musings on E-Commerce Retail Metadata

Retail-based e-commerce, buying and selling products online, is a growing part of all our lives. Whether purchasing the latest DVD movie, ordering a part for your hand-built PC, or just treating yourself to the odd novel, using retail websites is something we all do day in, day out.

The business of creating operational e-commerce sites is a complex one. From relationships with data vendors and product suppliers, to product data, supply chain and pricing issues, to website navigation, search, and item ordering and fulfillment, there is a lot to consider. A great deal of work goes into creating and running the sites we use every day, but how much do many people know about what goes on behind the scenes? As the swan glides smoothly across the water, how much frantic paddling is really taking place below the surface? I thought I'd take a little time to introduce a few key areas, giving a flavour of some e-commerce retail day-to-day issues and challenges.

Product Data
Metadata is a major issue for e-commerce retail websites. One of the clearest divisions many people will see on their favourite sites is between descriptive metadata and promotional metadata. Companies often maintain divisions between these two data types. Different people may create them, in different systems and for different reasons. Different types of metadata may be used in different ways and often interact in specific ways with each website’s search and browse functionality.

Descriptive metadata is usually linked to products in order to describe them - either for the benefit of people, IT systems or both. Taking books as an example, their product attributes can be many and varied including: title names, author names, publisher names, languages, prices, weights, number of pages, genres or subjects.

Take a look at any retail site, look closely and you'll start to see the metadata. You'll soon notice how much of it there is, and there's usually more of it behind the scenes than there is in the public-facing website pages.

How do companies deal with all this data sloshing around their web businesses?

One key concern is the need to decide on what metadata is controlled - how and why, and what metadata is not controlled. It's perfectly reasonable to have a number of free text fields, in combination with data fields that are semi-controlled and other fields which only contain pre-determined values.

For example,

Free text fields can be populated using agreed editorial guidelines. These may cover short and long descriptions of products, product reviews by users, publishers, or newspapers.

Semi-controlled fields will be constructed under tighter guidelines. These fields often include titles or sub titles.

Controlled named entity fields are a big part of the equation. These deal with the proper names of people - authors or illustrators, organisations - such as publishers, and events etc. These metadata fields are populated with controlled names taken from authority files.

Another key set of controlled vocabulary fields covers: subjects or genres, types - books, CDs etc, formats - paperbacks, or hardbacks, and audiences - children, students, teens. Values for these fields are often created in thesauri with preferred terms being linked to variant forms or synonyms, hierarchical relationships between preferred terms, and related links across them to take people from a subject like Dogs, to the related subject of Pets.
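To give a flavour of what such a thesaurus might look like behind the scenes, here is a minimal Python sketch. The class names, the variant forms 'Dog' and 'Canines', and the 'Animals' parent term are all my own illustrative assumptions; only the Dogs-to-Pets related link comes from the example above. It is one possible shape for the data, not a description of any particular system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Term:
    """One preferred term in a controlled vocabulary (hypothetical structure)."""
    preferred: str
    variants: Set[str] = field(default_factory=set)  # synonyms and variant forms
    broader: Optional[str] = None                    # parent preferred term, if any
    related: Set[str] = field(default_factory=set)   # associative links


class Thesaurus:
    def __init__(self) -> None:
        self.terms: Dict[str, Term] = {}
        self._lookup: Dict[str, str] = {}  # lower-cased label -> preferred term

    def add(self, term: Term) -> None:
        """Register a term and index its preferred name and variants."""
        self.terms[term.preferred] = term
        for label in {term.preferred, *term.variants}:
            self._lookup[label.lower()] = term.preferred

    def resolve(self, label: str) -> Optional[str]:
        """Return the preferred term for any known label, else None."""
        return self._lookup.get(label.strip().lower())

    def narrower(self, preferred: str) -> List[str]:
        """List the preferred terms that sit directly below the given term."""
        return sorted(t.preferred for t in self.terms.values()
                      if t.broader == preferred)


# Illustrative data only: variants and the 'Animals' parent are assumptions.
thesaurus = Thesaurus()
thesaurus.add(Term("Animals"))
thesaurus.add(Term("Dogs", variants={"Dog", "Canines"},
                   broader="Animals", related={"Pets"}))
thesaurus.add(Term("Pets", related={"Dogs"}))

print(thesaurus.resolve("canines"))
print(thesaurus.narrower("Animals"))
print(thesaurus.terms["Dogs"].related)
```

Keeping the variant-to-preferred lookup separate from the term records means a website can display whichever label suits it while back-end systems always store the preferred term.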

A moot point is how much control should exist behind promotional metadata versus product descriptive metadata?

I've always taken the view that usually getting the ‘best of both’ is ideal. In a fast paced area - promoting products to customers, much of the promotional metadata needs to be loosely controlled in terms of how it is created and used. Staff should be given freedom to respond to market conditions and use their initiative. However, tighter controls are often needed in terms of who creates promotional metadata. This helps to achieve a broad consistency – giving customers a framework to work within.

Customers visiting e-commerce websites are often looking for something new, and fast moving promotional metadata assists them with this need. However, customers also like to know how to get to regular promotions relating to fairly consistent products e.g. ‘CD Friday – 10% off the top 20’. Consistency comforts and reassures, unpredictability excites and enlivens – a mix can give the best result.

How is all this descriptive and promotional metadata created?

For descriptive metadata -

* The needs of each product type, core customers, and the business are assessed and a list of metadata fields created. For example, a field may be created to cover the concept of product genres. The preferred internal name of the field may be ‘Subjects’, while its website display name may be ‘Genres’.
* Controlled vocabulary terms, needed for controlled vocabulary fields, are created to support product descriptions. These terms are maintained by staff and assigned to products. For example, current genre terms are gathered and reviewed, then approved as they are, modified or removed. Additional controlled vocabulary is created and maintained to support customer needs.
* Some metadata fields will be populated from third party vendors; others will have data manually entered into them as free text, semi-controlled text, named entities or controlled vocabulary terms. For example, subject entries are applied to products through mapping from vendors or entry by staff.
* Governance structures are created and maintained, including descriptions outlining the data entered into each field, and guidelines on where the data comes from, how it is entered and who enters it. For example, rules are often written explaining why a controlled vocabulary focused on subjects is needed, why the internal name is ‘Subjects’ and the external name is ‘Genres’, how terms are created, maintained and deleted, and what relationships exist between these terms and between related terms in other areas. Relationships may be hierarchical (Broad term > Narrow terms) or associative links between related concepts. Other questions usually include how a vocabulary is used by other systems, who has the right to request additions and deletions, and who has the final say.

Vocabulary development

The creation of efficient and effective controlled vocabularies - taxonomies, thesauri and ontologies - whether they support browsing or retrieval, is a controlled process based on ongoing assessment and review. It is not something that is done quickly or with little thought. Proper steps are taken to fully and effectively create and modify the necessary vocabulary types. Essentially, it is possible to create and develop a consistent and logical set of data structures behind the scenes, upon which much can be effectively built, whilst ensuring flexibility as to how data elements and the relationships between them can be displayed on websites.

For promotional metadata -

Guidelines are usually written outlining the role of promotional metadata and the ways in which it promotes sales. These guidelines describe and control the process of creating, modifying and removing promotional metadata. Specific staff create, modify and delete promotional metadata on a daily basis. The effectiveness of promotional metadata in generating product sales is analysed and changes made as needed. Effective promotional metadata is often defined as promotional data that sells more products. Ineffective promotional metadata is conversely defined as that which sells fewer products, reduces or damages the sales experience, or results in potential customers not buying and moving to rival websites.

A simple example of one possible guideline may be to ensure that in the promotional area of a website, promotional items with a short duration are always at the top of the display, whilst those with a longer duration are lower down the display. People need to see very time restricted offerings first, but like to know where longer sales promotions can easily be found.

For example

Unstructured view:

• Buy Dr Who DVDs
• 2 for 3 on CDs
• Latest Disney blu-rays
• Magic Monday – special offers
• 6 hour speed sale – click now

Structured view:

• 6 hour speed sale

• Magic Monday – special offers

• 2 for 3 on CDs

• Buy Dr Who DVDs
• Latest Disney blu-rays
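The structured view above can be produced by a very simple rule: sort promotions by how long they run, shortest first. The Python sketch below shows one way to do that. The `Promotion` class and every duration in it are assumptions I have made up for illustration; a real site would take these from its own promotions system.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List


@dataclass
class Promotion:
    title: str
    duration: timedelta  # how long the offer runs (assumed values below)


def order_for_display(promotions: List[Promotion]) -> List[Promotion]:
    """Shortest-running offers first; sorted() is stable, so promotions
    with equal durations keep their original relative order."""
    return sorted(promotions, key=lambda p: p.duration)


# The unstructured list from the example, with invented durations.
promotions = [
    Promotion("Buy Dr Who DVDs", timedelta(days=30)),
    Promotion("2 for 3 on CDs", timedelta(days=7)),
    Promotion("Latest Disney blu-rays", timedelta(days=30)),
    Promotion("Magic Monday - special offers", timedelta(days=1)),
    Promotion("6 hour speed sale", timedelta(hours=6)),
]

for promo in order_for_display(promotions):
    print(promo.title)
```

Staff keep the freedom to create whatever promotions the market needs, while the display rule gives customers a consistent place to look.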

The challenges e-commerce sites face are many and varied. These include:

Data mapping from vendors:
* Reviewing mapping tables.
* Documenting data needs.
* Analysing data vendor metadata for breadth, depth and accuracy.
* Negotiating with vendors regarding additional information or fixing data feed issues.
* Modifying mapping tables – changing current mappings, adding additional ones, and creating new vocabularies to map to.
* Testing and releasing updates.
* Agreeing and implementing governance rules and guidelines.
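As a rough illustration of what a mapping table does, the Python sketch below translates vendor subject labels into in-house preferred terms and sets aside anything it cannot map for staff review. The vendor labels, the in-house terms and the function name are all hypothetical; real feeds and tables will look different.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical mapping table: vendor label (lower-cased) -> in-house subject term.
VENDOR_TO_SUBJECT: Dict[str, str] = {
    "crime fiction": "Crime & Mystery",
    "mysteries": "Crime & Mystery",
    "sci-fi": "Science Fiction",
}


def map_vendor_subjects(
    vendor_values: Iterable[str],
    mapping: Dict[str, str],
) -> Tuple[List[str], List[str]]:
    """Return (mapped in-house terms, unmapped vendor values).

    Unmapped values are kept as received so staff can review them and
    decide whether to extend the mapping table or the vocabulary."""
    mapped: List[str] = []
    unmapped: List[str] = []
    for value in vendor_values:
        key = value.strip().lower()
        if key in mapping:
            term = mapping[key]
            if term not in mapped:  # avoid assigning the same subject twice
                mapped.append(term)
        else:
            unmapped.append(value)
    return mapped, unmapped


subjects, needs_review = map_vendor_subjects(
    ["Crime Fiction", "Mysteries", "Steampunk"], VENDOR_TO_SUBJECT
)
print(subjects)
print(needs_review)
```

The list of unmapped values is the useful by-product here: it feeds directly into the review and modification tasks listed above.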

Named entity enhancements:
* Reviewing named entity files for data accuracy, breadth and depth.
* Identifying problems with current data and possible problems with the addition of data – either from vendors or through manual entry.
* Fixing vendor or data entry issues relating to new data.
* Cleaning metadata previously entered.
* Identifying named entities with more than one alternative name. For example, ‘Arthur Conan Doyle’, ‘Conan Doyle, Arthur’, ‘Conan Doyle’, ‘Doyle, Conan’, ‘Arthur Conan-Doyle’, etc.

Data cleansing tasks to fix these kinds of problems would include: identifying the named entities, choosing a preferred name based on guidelines and creating a data structure allowing the creation of a number of alternative names, which would be linked as synonyms to the preferred name. A newly cleansed vocabulary of preferred names, related to a wide number of synonyms, would assist greatly with data retrieval.
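A minimal Python sketch of that kind of structure follows, using the Conan Doyle variants listed above. The `AuthorityFile` class and the `normalise` rules (lower-casing, treating hyphens as spaces, dropping other punctuation) are my own assumptions about what a simple version could look like, not a standard.

```python
import re
from typing import Dict, Iterable, Optional


def normalise(name: str) -> str:
    """Lower-case, treat hyphens as spaces, drop other punctuation,
    and collapse runs of whitespace."""
    name = name.lower().replace("-", " ")
    name = re.sub(r"[^\w\s]", "", name)
    return " ".join(name.split())


class AuthorityFile:
    """Hypothetical authority file linking variant names to a preferred name."""

    def __init__(self) -> None:
        self._preferred: Dict[str, str] = {}  # normalised name -> preferred name

    def add(self, preferred: str, variants: Iterable[str] = ()) -> None:
        for name in (preferred, *variants):
            self._preferred[normalise(name)] = preferred

    def resolve(self, name: str) -> Optional[str]:
        """Return the preferred name for a known variant, else None."""
        return self._preferred.get(normalise(name))


authors = AuthorityFile()
authors.add(
    "Arthur Conan Doyle",
    variants=["Conan Doyle, Arthur", "Conan Doyle", "Doyle, Conan",
              "Arthur Conan-Doyle"],
)

print(authors.resolve("conan doyle, arthur"))
print(authors.resolve("Doyle, Conan"))
print(authors.resolve("Agatha Christie"))
```

Variants still have to be identified and entered by people or mapped from vendors; the normalisation step only catches differences in case, hyphenation and punctuation.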

Search Support:
* Reviewing and extending search synonyms.
* Reviewing search metrics: zero hits, few hits, too many hits, searches with low product views, searches with low basket conversion rates.
* Directing the results of each search review into enhanced metadata creation, product descriptions, search effectiveness (e.g. stop words review and updating) and website usability.
* Agreeing and implementing governance rules and guidelines.
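To show how a search metrics review might start, here is a small Python sketch that sorts logged searches into the problem groups mentioned above. The log format, the field names and the thresholds are assumptions for illustration only, and "low" product views and basket conversion are simplified here to "none at all"; every site would need to set its own cut-offs.

```python
from collections import defaultdict
from typing import Dict, List

# Assumed thresholds; each site would tune these to its own catalogue size.
FEW_HITS = 3
TOO_MANY_HITS = 500


def review_searches(log: List[dict]) -> Dict[str, List[str]]:
    """Group search queries by the kind of problem they suggest."""
    buckets: Dict[str, List[str]] = defaultdict(list)
    for entry in log:
        query, hits = entry["query"], entry["hits"]
        if hits == 0:
            buckets["zero hits"].append(query)
        elif hits <= FEW_HITS:
            buckets["few hits"].append(query)
        elif hits >= TOO_MANY_HITS:
            buckets["too many hits"].append(query)
        # Results were shown but nobody clicked through to a product.
        if hits > 0 and entry["product_views"] == 0:
            buckets["no product views"].append(query)
        # Products were viewed but nothing was added to a basket.
        if entry["product_views"] > 0 and entry["basket_adds"] == 0:
            buckets["no basket adds"].append(query)
    return dict(buckets)


# Invented sample log entries.
search_log = [
    {"query": "conan doyl", "hits": 0, "product_views": 0, "basket_adds": 0},
    {"query": "dvd", "hits": 1200, "product_views": 3, "basket_adds": 0},
    {"query": "sherlock holmes box set", "hits": 2, "product_views": 2, "basket_adds": 1},
    {"query": "blu-ray", "hits": 40, "product_views": 0, "basket_adds": 0},
]

for problem, queries in review_searches(search_log).items():
    print(problem, queries)
```

A zero-hit query such as the misspelt author name is a candidate for the synonym list, while a query returning far too many results points towards better facets or filters.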

Browsing Assistance:
* Creating and displaying consistent and intuitive facets describing and promoting products.
* Creating useful divisions between descriptive and promotional categories.
* Creating processes to manage the maintenance of these divisions, the ways in which both are created and developed and the ways in which metadata in back-end systems interacts with metadata displayed on public facing websites.
* Agreeing and implementing governance rules and guidelines.

Luckily all of these challenges can be dealt with and minimised by employing the ongoing professional services of staff or consultants adept at using data analysis, metadata modelling, taxonomy, thesaurus and ontology creation and mapping to support content description and findability. When these skills are combined with current stake analysis, key task analysis and supported by the best in usability, wonderful things can be achieved.

Ian

Friday 30 April 2010

Juice Based Findability

I recently returned from an e-commerce assessment project in Cape Town. The project went well, and the client was absolutely wonderful - very welcoming and extremely keen to strengthen the asset categorisation of their products and the search and browse support they offer.

My stay was extended somewhat by the antics of an Icelandic volcano - yes me too. While I was on 'volcation' I enjoyed a number of visits to the hotel's 'full breakfast buffet'. Sitting there, sipping my coffee, I received a lesson in 'Juice Based Findability' - bear with me, it will make sense soon.

My hotel had the usual juice section - glasses close to a variety of freshly squeezed juices. I probably sat near this juice area 10 times during my recent stay. Whilst idly watching my fellow breakfasters I noticed at least 5 occasions when the guests could not find the glasses for the juice. The thing was that the glasses were lined up below the juice bar, and the table top on which the juice bar was sitting was wide enough to obscure the glasses to the guests who were standing next to the juices. On a number of occasions, guests approached the juice area intent on getting a drink, and all too often they were unsuccessful - they could just not find the glasses. Some looked around quite determinedly, some spent longer than others trying to track down the errant glasses. Some asked members of staff for help, some just walked away and got a coffee or tea instead.

Some people tried harder than others to solve the problem for themselves and get a glass of juice, but everyone with the problem was unsuccessful in solving it. The same staff were asked to solve the same problem day in day out, and yet they never altered the juice bar area. They never changed the location of the glasses or added any signage explaining the location of the glasses.

This experience is very similar to information finding challenges online. All too often sites do not make information finding tasks as simple and as fast as they should be. Also, when faced with real people having real problems, some sites ignore them, others help individuals via customer services centres, but most don't fix the root of the problem.

Faced with problems, frustrated by confusing navigation, strange search results, or missing information, most web users will go elsewhere with their business. If they do let the site owner know the problem, then please website owners, fix it at the root so other people don't encounter it.

Sometimes information architects and website owners are too close to things - too focused on their issues and their plans. They need to regularly take a step back and watch their customers and users interacting with their websites.

Next time you have a moment, look at the key information tasks your customers or clients have, sit back and ask, "How easy is it to get to the juice?" Analyse search logs, or sit with people and watch them use your site; there are lots of ways to do it. Then, act on what you see, focusing on helping most of the people most of the time. I guarantee that valuable lessons will be learned and findability will improve.

Dow Jones Client Solutions offers audits targeted at improving information findability through enhanced asset categorisation, browse navigation and search support. Let me know if you would like to get more value out of your information.

Ian

Monday 22 March 2010

E-Commerce Websites - Metadata and Controlled Vocabulary Can Help

I've worked for Dow Jones Client Solutions, managing our 'Outside Americas' information consulting services, since 2006. In that time I've been involved in a wide range of projects for a variety of businesses.

Dow Jones Client Solutions offers a diverse range of information management services, amongst them services to: organise audio, video, image, and text assets, improve information browsing, provide effective search experiences and create bespoke user journeys that direct clients and customers from initial products and services to related ones.

I've been thinking a lot about e-commerce websites recently, and looking at quite a few examples of the genre. I am also a customer myself, and all too frequently come up against frustrating websites with poor search and browse functionality and a complete lack of regard for the possible customer.

Competition online is strong. It's easy for customers to move between competing websites - choosing the ones with the best experience and the right mix of products, price and customer service. Revenue and market share go to sites that offer an easy to understand information architecture - with user-friendly navigation, an intuitive and efficient search experience - with effective asset categorisation, search facets and filters, related links to products and services, and the appropriate sets of keywords to direct simple searches to the appropriate results.

Dow Jones Client Solutions offers:

* E-Commerce Assessments.
* Search and browse advice and development.
* Metadata and vocabulary development and maintenance.
* Categorization advice for text, images, video and audio assets.
* Vocabulary and metadata mapping to aid sharing and interoperability.
* Metadata and vocabulary translation and localisation.
* Information management workshops and training sessions.

If anyone reading this feels that the consulting services we offer may be of interest, I would love to arrange a quick informal call to discuss your business objectives.

I look forward to hearing from you.

Ian