This page lists some experiments I did with my collocations and N-grams data in late 2018. My aim was to give an example of how the data could be further processed to obtain new knowledge.

I did five experiments, which are summarised below. The results link at the bottom of this page leads to a directory from which you can download the results.

The five experiments were not planned as a series. I did each one without expecting to do any more. I have not gone back to revise what I wrote, so the documents in the results folder present my thoughts as they developed at the time. These documents were written as examples and were not intended for submission to a journal. I circulated them privately to a few scholars for their interest.

Experiment 1 - Which N-Grams are the Best?

In this experiment I used my formal N-grams data to try to establish what kinds of N-grams are the best for authorship attribution.

Experiment 2 - Which N-Grams are the Best? Part 2

I continued experiment 1, adding more detail to the results and refining the conclusions.

Experiment 3 - A Control Test

I did a control test, to show that the method I used in the previous experiments is not biased.

Experiment 4 - Arden of Faversham and the Extended Kyd Canon

I enlarged the set of plays used in the previous experiments to include the so-called Extended Kyd Canon defined by Sir Brian Vickers.

Experiment 5 - Whole Canons Method

In the final experiment I sketched a new method of authorship attribution and did a preliminary test with it. A great deal more work would need to be done if I were to develop the method further, particularly for use with scenes and acts rather than whole plays.

Browse Results (approx. 173 MB)