A new capability has been added to consumer and customer surveys in the past few years: capturing respondents' answers to open-ended questions in video, typically recorded by the respondent on his or her mobile phone in selfie mode. For many smartphone users, voicing their opinions or experiences in video when participating in surveys should come as a natural and instinctive way to respond.

The use of video recording in marketing research became more feasible about three decades ago with the entry of video cassette cameras. Video had been in use in social research even earlier (e.g., in ethnography), enabled for example by Super 8 film. Since the turn of the 21st century, digital video media and camera technology have made the use of video-recorded evidence or data in research even more accessible and practical. Russell Belk and Robert Kozinets discussed methods of videography in marketing and consumer research in an instructive article from 2005. What has changed the most since the early 2000s is the growing ability of consumers and customers to take their own videos (e.g., at home, while shopping) rather than being observed and filmed by others. Videography is more often dedicated to ‘filming’ the behaviour of consumers in various situations, events and locations (e.g., on holiday, in a hotel), possibly accompanied by the participant’s own explanations. The use of video in online and mobile surveys is a relatively late extension of video application in marketing and consumer research.

A few aspects of recording open-ended answers in video are suggested below for consideration:

For some consumers, speaking their answer to an open-ended question aloud would indeed be easier, more natural and more fluent than writing or typing it. Speaking may also be faster than writing. The option of recording the answer in video on a smartphone would be convenient and appealing to those who are technically proficient and at ease in operating the video feature on their device. They are more likely to be younger, aged 18-35, but anyone who is computer-oriented and also comfortable with smartphones and their applications may be able and willing to contribute a response in video.

However, requiring respondents to record an open-ended answer in video could lead to self-exclusion of many for whom the technical skill is not so obvious. Users who do not feel confident, are less computer-literate, or prefer to use only limited functions on their smartphones (many still use even simpler mobile phones) will exclude themselves by declining to answer such open-ended questions. One should also take the issue of privacy into consideration: some people may simply not wish to submit a video image of themselves to research firms and their clients because it breaks down the customary protection of anonymity of responses. Therefore, a researcher should consider carefully who is in the target research population for the survey, and ensure that those included are likely to be able to use the tools needed for recording and submitting their answer in video. In other cases, respondents may be given the option to capture their answer in video or submit it in text; it would then be the responsibility of the researcher to compare and integrate answers submitted in different modes.

Preparing transcriptions of answers from the audio of video clips can be time-consuming and is also subject to errors of comprehension. It is usually necessary, however, to transfer the answers into text in order to prepare the input for analysis. Analysis is nowadays performed more frequently with text analytic tools, and in recent years the task of transcribing may also be performed by software tools. A key challenge in analysing verbal answers given in the free language used by respondents is grasping the whole story told by a respondent. In other words, it means going beyond picking up single key words or stand-alone short expressions. The words have to be understood in their genuine context. For example, when a customer talks about a negative or positive experience of interaction with a service agent, the event should be captured in full, not just by tracking salient and frequent key words within the answer. Of course, this challenge applies to any verbal responses, whether submitted originally in text or orally, that are analysed with text analytic tools using the capabilities of natural language processing (NLP).
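The gap between counting key words and reading them in context can be illustrated with a small sketch. It is only a toy example under stated assumptions: the word lists, the sample answer and the function names (`keyword_counts`, `context_aware_counts`) are all invented here for illustration, and the simple negation rule stands in for the far richer context handling that real text analytic tools would need.

```python
import re
from collections import Counter

# Hypothetical word lists, invented for this illustration only.
POSITIVE = {"helpful", "friendly", "quick"}
NEGATIVE = {"rude", "slow", "unhelpful"}
NEGATORS = {"not", "never", "hardly"}


def tokenize(text):
    """Lowercase the answer and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_counts(text):
    """Context-free tally: count positive and negative key words."""
    counts = Counter()
    for token in tokenize(text):
        if token in POSITIVE:
            counts["positive"] += 1
        elif token in NEGATIVE:
            counts["negative"] += 1
    return counts


def context_aware_counts(text, window=2):
    """Same tally, but flip a key word's polarity when a negator
    appears within `window` tokens before it."""
    tokens = tokenize(text)
    counts = Counter()
    for i, token in enumerate(tokens):
        if token in POSITIVE:
            polarity = "positive"
        elif token in NEGATIVE:
            polarity = "negative"
        else:
            continue
        preceding = tokens[max(0, i - window):i]
        if any(t in NEGATORS for t in preceding):
            polarity = "negative" if polarity == "positive" else "positive"
        counts[polarity] += 1
    return counts


# Invented sample answer, as it might appear after transcription.
answer = "The agent was not helpful and never friendly, so the call was slow."
print(keyword_counts(answer))        # key words alone
print(context_aware_counts(answer))  # key words read with nearby negation
```

On this sample answer the context-free tally reports more positive than negative key words, although the respondent is describing a poor experience; the negation-aware tally classifies all three key words as negative. Even so, a rule of this kind captures only a sliver of the story a respondent tells.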

Furthermore, if an open-ended answer is captured in a video clip, then researchers should try to take advantage of the additional layers of information made available to them: voice and image. Meaningful evidence may be extracted from the visual image of a respondent by tracing and coding his or her facial expressions while talking. Facial expressions can give cues to the emotions felt by the consumer-respondent (e.g., distressed and angry vs. thrilled and happy). In some cases a facial expression may uncover an emotion not expressed through the words alone (e.g., surprise, irony) that could even put what is said in a different light. But in order to interpret the facial expression correctly, it may still be essential to evaluate it in combination with the story told by the respondent. Technical capabilities also exist to extract further meanings from voice (e.g., tone and pitch may reveal levels of arousal, intensity or excitement of the speaker). However, this area of research is still in a stage of learning and development; challenges remain, for example, in making correct inferences of feelings and emotions from facial expressions, and in aggregating evidence over a sample of survey respondents (e.g., 300 to 500 respondents).

Capturing open-ended answers in surveys via video creates new and even fascinating possibilities for researchers: to reach out to younger generations of consumers; to induce consumers to contribute their viewpoint in a way they may be more comfortable with; and to obtain richer information. Yet researchers should consider thoughtfully when and how it is most appropriate and effective to use the video mode for capturing the oral responses of consumers or customers.


Videography in Marketing and Consumer Research; Russell W. Belk and Robert V. Kozinets, 2005; Qualitative Marketing Research: An International Journal, 8 (2), pp. 128-141

“Pressing Fast Forward on Insight Generation with Video”, a webinar hosted by Confirmit (February 2019), with guest presenters Matt Marontate (LivingLens, SVP Sales, livinglens.tv) and Carol Fitzgerald (BuzzBack, President & CEO)

How Emotions Are Made: The Secret Life of the Brain; Lisa Feldman Barrett, 2017; UK: Macmillan
