Transforming text files with XSLT takes two steps: first the plain text is converted into basic XML, and then an XSLT stylesheet is applied to restructure or format that XML. XSLT matches elements by tag and rewrites the structure of their content, which makes the result easier to read, analyze, and use with other tools.
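The first step, wrapping plain text in basic XML, can be sketched roughly as follows. This is a minimal illustration and not the project's actual conversion: the `SPEAKER: dialogue` line shape, the element names `episode`, `speech`, and `note`, and the function name `transcript_to_xml` are all assumptions made for the example, and the sample lines are invented.

```python
import re
import xml.etree.ElementTree as ET

# Assumed line shape: "Speaker Name: dialogue". Real transcripts may differ.
SPEAKER_LINE = re.compile(r"^([A-Za-z][A-Za-z .'-]*):\s*(.+)$")


def transcript_to_xml(text, title="untitled"):
    """Wrap plain transcript text in a basic XML tree (hypothetical schema)."""
    root = ET.Element("episode", {"title": title})
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        match = SPEAKER_LINE.match(line)
        if match:
            # Lines that look like dialogue become <speech speaker="...">
            speech = ET.SubElement(root, "speech", {"speaker": match.group(1).strip()})
            speech.text = match.group(2)
        else:
            # Everything else (scene headings, stage directions) becomes <note>
            note = ET.SubElement(root, "note")
            note.text = line
    return root


if __name__ == "__main__":
    # Invented sample text, not taken from any real transcript
    sample = "Speaker One: Hi.\n\n(door opens)\nSpeaker Two: Who is it?"
    print(ET.tostring(transcript_to_xml(sample, title="sample"), encoding="unicode"))
```

Once the text has this shape, an XSLT stylesheet could match `speech` elements and, for example, group or reformat them by the `speaker` attribute. Any line that does not fit the assumed pattern falls through to `note`, so inconsistently formatted lines are kept rather than dropped.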
Our text corpus comes from the Cold Case TV show transcripts available at Forever Dreaming Transcripts. This was a useful source because it provides full scripts with natural character dialogue, ideal for analyzing language patterns in a modern crime drama. To prepare the texts, we selected the episodes of the first season and saved them as plain text files for analysis.
When we first looked at the texts, we noticed right away that the transcripts were not formatted consistently throughout, which caused problems during text processing. Some team members also had trouble pushing and pulling changes after a recent macOS update, which forced re-downloads. We also had difficulty making the Cytoscape visualization interactive, because we could not find a built-in option for exporting an interactive embed for the website.