Mobile text entry datasets available for download

We have released the datasets from our OATS (Empirical investigation & user-centred development of touch-screen text entry methods for older adults) project. These datasets contain the results from our text entry studies with mobile touchscreens, with older (and some younger) adults.

The datasets contain a variety of information ranging from qualitative to quantitative data captured with our keyboard prototypes and are provided under a CC-BY license to all interested researchers.

Please also feel free to download our highlighting keyboard prototype from GitHub.

Examples of the data contained in the sets:

2015-04 Injected Errors

Summary:
Our goal was to capture participants’ typing behaviour during input in a lab setting and determine the efficiency of the highlighting keyboard's error support mechanism. For this study we introduced a new experimental technique, the “injected error” algorithm, designed to overcome the problem of participants typing very carefully. The experiment involved both older and younger users, in separate sessions.

Study date
April 2015

Contents

Data and Analysis
keyboard_data.sql: data captured from the keyboard (note: due to corruption, some data on older users is missing)
Nasas+post task older: subjective feedback data from older users
Nasas+post task younger: subjective feedback data from younger users
tees_data.sql: data captured from the TEES application

Study materials
exit questionnaire: issued to participants at the end of the session
lab study info cards: given to participants during experiment
nasa: NASA-TLX and subjective feedback questionnaires given after each condition
study plan: aims and process of the experiment sessions
task order - older: participant and task order details for older adults
task order - younger: participant and task order details for younger adults

Database guide

Keyboard data: Data captured directly from the keyboard is held in 3 tables (sessions, suspects, typing events). Their structure is as follows:
- Sessions: a “session” begins when the keyboard appears on the screen and ends when it is hidden. Some applications cause a “keyboard show” event to occur even if the keyboard is not actually shown, so you may observe zero typing events for these sessions. Typically such sessions have “width” and “height” values of zero (because the keyboard was not actually drawn).
id (int): the session id
session_width (int): the keyboard width in pixels
session_height (int): the keyboard height in pixels
start_time (int): timestamp of when the session began
end_time (int): timestamp of when the session ended
app (text): application or package name in which this session took place
low_errors (int): number of minor spelling mistakes in this session
high_errors (int): number of major spelling mistakes in this session
suggestions_picked (int): number of picked suggestions in this session
injections (int): number of errors injected in this session
first_word (text): the first word entered in this session
user (int): an id for the user that created the session
folder (text): experiment condition with which this data was generated (C1= Normal, C2= Normal + Injected, C3= Highlighted, C4= Highlighted + Injected)
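
As a minimal sketch of how the zero-size sessions described above might be excluded before analysis, the following Python example builds a small in-memory SQLite table and filters it. The SQL dialect of keyboard_data.sql is not stated here, so loading the data into SQLite is an assumption, as are the sample rows and the helper name real_sessions; only a subset of the documented columns is created.

```python
import sqlite3

# Hypothetical in-memory stand-in for the sessions table. The rows are
# invented for illustration and only some documented columns are included.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE sessions (id INTEGER, session_width INTEGER, '
    'session_height INTEGER, start_time INTEGER, end_time INTEGER, '
    '"user" INTEGER, folder TEXT)'
)
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 1080, 640, 1000, 9000, 7, "C1"),
        (2, 0, 0, 9500, 9600, 7, "C1"),  # keyboard never actually drawn
        (3, 1080, 640, 10000, 25000, 7, "C2"),
    ],
)


def real_sessions(connection):
    """Return ids of sessions whose keyboard had a non-zero size."""
    rows = connection.execute(
        "SELECT id FROM sessions "
        "WHERE session_width > 0 AND session_height > 0 ORDER BY id"
    )
    return [row[0] for row in rows]


kept = real_sessions(conn)
print(kept)
```

Filtering on both dimensions follows the description above; whether every zero-size session is safe to discard is a judgement for the analyst.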

- Suspects: a “suspect” character is the last character to be deleted by the user after a sequence of one or more backspaces, i.e. one that we consider the user typed in accidentally and then proceeded to delete.
id (int): an id for the suspect entry
sessionid (int): the session in which the suspect was recorded
suspect (text): the suspect character
user (int): the id of the user who initiated the session
folder (text): experiment condition with which this data was generated
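
To illustrate the suspect definition above, here is a rough Python sketch that scans a sequence of key presses and reports the last character deleted in each run of backspaces. It is one reading of the definition, not the keyboard prototype's actual code; the BACKSPACE marker, the function name find_suspects and the sample input are all assumptions made for this example.

```python
BACKSPACE = "\b"  # assumed marker for a backspace press in this sketch


def find_suspects(keys):
    """Return the last character deleted in each run of backspaces."""
    buffer = []          # characters currently in the text field
    suspects = []
    last_deleted = None  # most recent character removed in the current run
    for key in keys:
        if key == BACKSPACE:
            if buffer:
                last_deleted = buffer.pop()
        else:
            # A non-backspace key ends the run, so record its suspect.
            if last_deleted is not None:
                suspects.append(last_deleted)
                last_deleted = None
            buffer.append(key)
    if last_deleted is not None:  # input ended during a backspace run
        suspects.append(last_deleted)
    return suspects


# Invented input: "hwllo" is corrected back to the "w", then " x" to " y".
keys = list("hwllo") + [BACKSPACE] * 4 + list("ello x") + [BACKSPACE, "y"]
suspects = find_suspects(keys)
print(suspects)
```

In the invented input the user backspaces over "o", "l", "l" and "w" before retyping, so "w" is treated as the accidental character; the later single backspace makes "x" the second suspect.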

- Typingevents: all the keyboard typing events recorded during use.
id (int): an id for the event
sessionid (int): the id of the session in which this event took place
timesincelast (int): elapsed time in milliseconds between this and the previous input event belonging to the same session
duration (int): elapsed time in milliseconds between the key_down and key_up events of this input event
rawxdown (double): screen X coordinates of the key_down event
rawydown (double): screen Y coordinates of the key_down event
rawxup (double): screen X coordinates of the key_up event
rawyup (double): screen Y coordinates of the key_up event
xdown (double): keyboard area X coordinates of the key_down event
ydown (double): keyboard area Y coordinates of the key_down event
xup (double): keyboard area X coordinates of the key_up event
yup (double): keyboard area Y coordinates of the key_up event
minoraxisdown (double): minor axis of the touch ellipse area in the key_down event
majoraxisdown (double): major axis of the touch ellipse area in the key_down event
minoraxisup (double): minor axis of the touch ellipse area in the key_up event
majoraxisup (double): major axis of the touch ellipse area in the key_up event
keycode (int): raw key code of the input event
keychar (text): character corresponding to the input event
followspace (int): 0 or 1 depending on whether this event follows a space_bar character
precedespace (int): 0 or 1 depending on whether this event precedes a space_bar character
followshift (int): 0 or 1 depending on whether this event follows a press of the shift key
user (int): id of the user who generated this event
folder (text): experiment condition with which this data was generated
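
The typing event columns lend themselves to simple per-condition summaries. The sketch below, again assuming the data has been loaded into SQLite, computes the mean timesincelast per condition and the distance between the key_down and key_up positions of each event. The sample rows and the function names are invented for illustration, and only a subset of the documented columns is created.

```python
import math
import sqlite3

# Hypothetical in-memory stand-in for the typingevents table (invented rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE typingevents (id INTEGER, sessionid INTEGER, '
    'timesincelast INTEGER, duration INTEGER, xdown REAL, ydown REAL, '
    'xup REAL, yup REAL, keychar TEXT, "user" INTEGER, folder TEXT)'
)
conn.executemany(
    "INSERT INTO typingevents VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 300, 90, 100.0, 200.0, 103.0, 204.0, "h", 7, "C1"),
        (2, 1, 500, 80, 150.0, 200.0, 150.0, 200.0, "i", 7, "C1"),
        (3, 3, 500, 110, 120.0, 210.0, 126.0, 218.0, "o", 7, "C3"),
        (4, 3, 700, 95, 300.0, 210.0, 300.0, 210.0, "k", 7, "C3"),
    ],
)


def mean_interval_by_condition(connection):
    """Mean time between events in milliseconds, keyed by condition."""
    query = (
        "SELECT folder, AVG(timesincelast) FROM typingevents "
        "GROUP BY folder ORDER BY folder"
    )
    return dict(connection.execute(query).fetchall())


def touch_drift(connection):
    """Distance between key_down and key_up positions, keyed by event id."""
    query = "SELECT id, xdown, ydown, xup, yup FROM typingevents ORDER BY id"
    return {
        event_id: math.hypot(xu - xd, yu - yd)
        for event_id, xd, yd, xu, yu in connection.execute(query)
    }


intervals = mean_interval_by_condition(conn)
drift = touch_drift(conn)
print(intervals, drift)
```

The drift calculation uses the keyboard-area coordinates (xdown, ydown, xup, yup); the raw screen coordinates could be substituted in the same way.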

TEES data: Data captured from our TEES application. The structure is as follows:
- tees_injected: all submitted data info from TEES
user (int): the TEES user id (generated by TEES)
task_type (text): the type of task undertaken by the user
len (int): total submitted input length
secs_est (int): time in seconds to complete the task, measured from the time of input of the first character
wpm_est (double): calculated words-per-minute for this task
kspc_est (double): calculated keystrokes-per-character for this task
keys_est (double): calculated total number of keys pressed during this task
backspaces (int): number of backspaces pressed during this task
wordCompleteCount (int): number of estimated times the user picked a word from a suggestion bar (more characters input on screen than actually pressed on the keyboard)
wordCompleteAve (double): average number of autocompleted words in this task
autoCorrectCount (int): number of estimated times the user corrected their entry by using autocorrect (letters changed without the user having input any keys)
autoCorrectAve (double): average number of autocorrected words in this task
longestPause (int): time of longest input pause in milliseconds during this task
totalLongish (int): total duration of rather long pauses in milliseconds during this task
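
The exact formulas TEES uses for wpm_est and kspc_est are not documented here. As a point of reference only, the sketch below applies the conventional text entry definitions (one word counted as five characters; keystrokes divided by characters produced) to the len, secs_est and keys_est fields. Some definitions use len - 1 in the numerator for words per minute, so values recomputed this way may differ from the stored estimates. The function names and sample numbers are assumptions.

```python
def words_per_minute(length, seconds):
    """Conventional WPM: five characters count as one word."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (length / 5) / (seconds / 60)


def keystrokes_per_character(keys, length):
    """Conventional KSPC: keys pressed per character of submitted text."""
    if length <= 0:
        raise ValueError("length must be positive")
    return keys / length


# Invented values: 100 characters submitted in 60 seconds with 120 key presses.
wpm = words_per_minute(100, 60)
kspc = keystrokes_per_character(120, 100)
print(wpm, kspc)
```

A KSPC above 1 indicates more key presses than characters produced, for example because of corrections.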

- tees_injected_users: user and condition details
exp (int): the TEES experiment dataset used
user (int): the TEES user id generated for each participant for each dataset
condition (int): experiment condition (1= Normal, 2= Normal + Injected, 3= Highlighted, 4= Highlighted + Injected)
participant (text): participant initials
age (text): participant age
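
Since tees_injected holds the task results and tees_injected_users holds the condition for each TEES user id, the two tables can presumably be combined on the user column. The following sketch shows one way to compute the mean wpm_est per condition under that assumption. It uses an in-memory SQLite database with invented rows (including the task_type value "copy" and the participant initials), and creates only a subset of the documented columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names are quoted in case the target database reserves them.
conn.execute(
    'CREATE TABLE tees_injected ("user" INTEGER, task_type TEXT, '
    'wpm_est REAL)'
)
conn.execute(
    'CREATE TABLE tees_injected_users (exp INTEGER, "user" INTEGER, '
    '"condition" INTEGER, participant TEXT, age TEXT)'
)
conn.executemany(
    "INSERT INTO tees_injected VALUES (?, ?, ?)",
    [
        (101, "copy", 10.0),
        (101, "copy", 14.0),
        (102, "copy", 16.0),
        (103, "copy", 30.0),
    ],
)
conn.executemany(
    "INSERT INTO tees_injected_users VALUES (?, ?, ?, ?, ?)",
    [
        (1, 101, 1, "AA", "70"),
        (1, 102, 3, "AA", "70"),
        (1, 103, 1, "BB", "24"),
    ],
)


def mean_wpm_by_condition(connection):
    """Join tasks to users on the user id and average wpm_est per condition."""
    query = (
        'SELECT u."condition", AVG(t.wpm_est) '
        'FROM tees_injected AS t '
        'JOIN tees_injected_users AS u ON t."user" = u."user" '
        'GROUP BY u."condition" ORDER BY u."condition"'
    )
    return dict(connection.execute(query).fetchall())


wpm_by_condition = mean_wpm_by_condition(conn)
print(wpm_by_condition)
```

Because age is stored as text, any grouping by age band would need the values to be parsed first.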

Download the datasets here