Tokenized Text Word Position in Output
#1
Hi, I have collected data from a word-by-word self-paced reading study on FindingFive and am currently analyzing it. Although FindingFive does record the RT for each word (see the yellow column in the screenshot for RTs and the blue column for each word in the sentence), the output does not provide a column that shows the position of each word in the full sentence. So far, I have been filtering my experimental conditions in Excel and labelling each word position manually (see the green column). I was wondering if there is a more efficient way of doing this, whether by adding a specific argument to the tokenized text option, or through an automated Excel sheet or R script that could facilitate this process. Thanks in advance!


[Attachment: screenshot of the FindingFive output]
#2
Hi, that's a great question! I think the Excel trick is pretty nice, although a bit labor-intensive. In R, you could do something like a loop where you look for unique trial number + response name combinations and, within each unique pair, use seq(1, length(pair)) (or simply seq_along()) to generate the word-position numbering.

Just a thought! Let us know how it goes!
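To make that suggestion concrete, here is a minimal sketch using base R's ave(), which applies a function within each group; the column names (trial_num, response) are placeholders, so adjust them to match your actual FindingFive output:

```r
# Toy data frame standing in for the FindingFive output;
# column names here are assumptions, not the real export headers.
data <- data.frame(
  trial_num = c(1, 1, 1, 2, 2),
  response  = c("The", "dog", "barked", "Cats", "sleep")
)

# ave() runs seq_along() within each unique trial_num,
# so each word gets its position within its own sentence.
data$word_position <- ave(seq_along(data$trial_num), data$trial_num,
                          FUN = seq_along)
```

This avoids an explicit loop entirely: the grouping is handled by ave(), and the numbering restarts at 1 for every trial.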
#3
Thanks Ting! Here is what worked for me on R:

# Needed for %>% and arrange() below
library(dplyr)

# Break the dataframe into a list of dataframes based on 'trial_num'
df_list <- split(data, data$trial_num)

# Add row index as a column to each dataframe
df_list <- lapply(df_list, function(df) {
  # seq_len() is safer than 1:nrow(df) if a trial is ever empty
  df$word_position <- seq_len(nrow(df))
  return(df)
})

# Combine the dataframes back together
combined_df <- do.call(rbind, df_list)

# Sort the combined dataframe by 'part_number' and 'trial_num'
combined_df <- combined_df %>% arrange(part_number, trial_num)

# View the resulting dataframe
print(combined_df)
#4
Thanks for sharing the code with our community!
#5
Understanding tokenized text word positions helps one analyze language models more effectively. It allows for precise manipulation and interpretation of text data, ensuring accurate processing and improving performance in tasks like translation, sentiment analysis, and natural language understanding.