The other day I found some beautiful posters that showed the punctuation in different novels. I was immediately curious whether I could do something similar, so I wrote a little script (code here) to extract the punctuation and print out a compressed representation of my favorite novels.
Then, like any proper scientist, I looked at the data and did some simple stats. Go see it on Medium!
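For the curious, the extraction idea can be sketched in a few lines of Python. This is not the linked script, just a minimal illustration. It assumes "punctuation" means the ASCII marks in `string.punctuation`, and the function names and the `width` parameter are made up for this example.

```python
import string

# Assumption: only ASCII punctuation counts; curly quotes, dashes, etc.
# would need to be added to this set for typeset editions.
PUNCTUATION = set(string.punctuation)


def extract_punctuation(text: str) -> str:
    """Return only the punctuation characters of `text`, in original order."""
    return "".join(ch for ch in text if ch in PUNCTUATION)


def compress(text: str, width: int = 60) -> str:
    """Lay the punctuation stream out in fixed-width lines for printing.

    `width` is a hypothetical layout parameter, not taken from the original script.
    """
    marks = extract_punctuation(text)
    return "\n".join(marks[i:i + width] for i in range(0, len(marks), width))


sample = (
    "It is a truth universally acknowledged, that a single man in "
    "possession of a good fortune, must be in want of a wife."
)
print(compress(sample))
```

To run it on a whole novel, read a plain-text file into a string and pass it to `compress`; a smaller `width` zooms in, a larger one zooms out.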
Here are a few of the (almost) full sets of punctuation from a couple of novels. For Pride and Prejudice, note the zoom-in versus the zoom-out:
Please let me know in the comments how totally wrong I was in my Medium analysis, and if there is anything else you would like to see.
Update: Here are a couple I thought were interesting. First, part of the Tractatus Logico-Philosophicus:
And then Ulysses, the difference between the beginning of the book (first) and the end of the book (second):