Scientists like me

I wanted to know how to find other scientists doing work similar (but different!) to mine. I like to think that I know most of the people working on nearby topics, but what about people who take similar approaches to totally different problems? There were a lot of good suggestions (especially the neuromatch algorithm), but I want to highlight two in particular:

Michael Hendricks mentioned the Journal/Author Name Estimator (JANE), which takes your abstracts and tries to figure out who you are most similar to. When I throw in a few of my abstracts I mostly get C. elegans people I know:

Annika Barber had an even better suggestion: playing around with NIH Matchmaker to find grants similar to your own. It pulls up a lot of really interesting projects! Especially when I remove the name of my model organism.

3% of Neuroscientists are here for revenge

I was curious how people got into neuroscience. Random happenstance? A lifelong love of gap junctions? So I asked about it on twitter and got hundreds of responses.

I did a quick analysis of about half the responses, putting them in different categories. It quickly became clear that certain themes were popping up again and again:

It doesn’t surprise me too much that a lot of people became interested in neuroscience for a specific reason: they cared about learning or decision-making or free will. A lot of you are here because a particular book or lecture was so good it blew you away. I was surprised by the number of people who were accidentally exposed to neuroscience because the class they wanted to take was full, or it was a distribution requirement at their university, or there happened to be someone next door who was doing research on it. It turns out that serendipity is a major driver of passion!

This suggests that the best way to get other people interested in neuroscience might be to just explain it to them.

Definite shout-out to the 9% of you who are here because of the drugs and 3% of you who are doing this for revenge.

#cosyne2020, by the numbers

Cosyne is the largest COmputational and SYstems NEuroscience conference. Many many years ago, I thought it would be a good idea to study the conference. Who goes? Whose work dominates? If this is the place where people come to exchange ideas, it is useful to know who is doing the exchanging and who is dominating the conversation.

The first thing I look at is who is most active (who is an author on the most abstracts) – and this year it is a four-way tie between Larry Abbott, Mehrdad Jazayeri, Jonathan Pillow, and Byron Yu, whom I dub this year’s Hierarchs of Cosyne (a rough sketch of how the tally works follows the list below). The most active in previous years are:

  • 2004: L. Abbott/M. Meister
  • 2005: A. Zador
  • 2006: P. Dayan
  • 2007: L. Paninski
  • 2008: L. Paninski
  • 2009: J. Victor
  • 2010: A. Zador
  • 2011: L. Paninski
  • 2012: E. Simoncelli
  • 2013: J. Pillow/L. Abbott/L. Paninski
  • 2014: W. Gerstner
  • 2015: C. Brody
  • 2016: X. Wang
  • 2017: J. Pillow
  • 2018: K. Harris
  • 2019: J. Pillow
  • 2020: L. Abbott/M. Jazayeri/J. Pillow/B. Yu
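(For the curious: once you settle on an author ID, the tally itself is trivial. Here is a minimal sketch in Python, assuming the abstracts come as plain lists of author-name strings – getting the data into that shape is, as always, the actual work.)

```python
from collections import Counter

def author_id(name):
    """Collapse a full name to (first initial, lastname) - the ID scheme I use throughout."""
    parts = name.split()
    return (parts[0][0].lower(), parts[-1].lower())

def most_active(abstracts):
    """abstracts: one list of author names per accepted abstract."""
    counts = Counter(author_id(a) for abstract in abstracts for a in abstract)
    top = max(counts.values())
    return [aid for aid, n in counts.items() if n == top], top

# toy example with made-up names:
abstracts = [["Jane Doe", "John Smith"], ["J. Smith", "Ada Lovelace"]]
print(most_active(abstracts))  # ([('j', 'smith')], 2)
```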

If you look at the most active across all of Cosyne’s history, you can see things shift. Looking across time below, you can see that Jonathan Pillow is starting to catch up with Liam Paninski and is breaking away from Larry Abbott. The other startling ascent is Carlos Brody – there’s a whole lot of Princeton going on at Cosyne.

What is in the abstracts? In the past I have tried to find words that are more common in accepted than rejected abstracts. I can visualize this using everyone’s favorite data visualization technique, WOOOORD CLOOOUUUDS. If you wanted to get accepted, it would have been better to write about decision-making trajectories using stable optogenetic attractor choices and worse to write about intrinsic geometry algorithms in tools and datasets.
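(Under the hood this is nothing fancier than relative word frequencies. A minimal sketch, assuming `accepted` and `rejected` are lists of abstract strings – the frequency floor and tokenization below are judgment calls, not the exact ones I used:)

```python
from collections import Counter
import re

def word_freqs(texts):
    words = [w for t in texts for w in re.findall(r"[a-z']+", t.lower())]
    total = len(words)
    return {w: c / total for w, c in Counter(words).items()}

def enrichment(accepted, rejected, min_freq=1e-4):
    """Words ranked by how much more frequent they are in accepted abstracts."""
    fa, fr = word_freqs(accepted), word_freqs(rejected)
    common = {w for w in fa if w in fr and fa[w] > min_freq}
    return sorted(common, key=lambda w: fa[w] / fr[w], reverse=True)

# enrichment(accepted, rejected)[:20]   # the words that "predict" acceptance
```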

What is more common in accepted abstracts today than when Cosyne was in Denver two years ago? There are fewer intrinsic dendritic attention pathways and more primate context shape timescales.

At the Cognitive Computational Neuroscience (CCN) conference last fall, Richard Gao presented a super cool poster where he analyzed conference abstracts from different computational neuroscience conferences and used word2vec to make useful embeddings (which is hard! – I have tried this before and failed). I asked him if he could take a crack at this year’s Cosyne abstracts and he was kind enough to agree.

Just looking at the most common topics it looks like recurrent and deep networks are, uh, very popular.

His word2vec embedding representations only need ~5 PCs to capture most of the variance in the words. COSYNE is low-dimensional 😦
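(Richard’s actual pipeline lives in his notebook, linked below; a bare-bones version of the same idea with gensim and scikit-learn, assuming `sentences` is a list of tokenized abstracts, would look something like:)

```python
from gensim.models import Word2Vec
from sklearn.decomposition import PCA

# sentences: list of token lists, e.g. [["recurrent", "network", ...], ...]
model = Word2Vec(sentences, vector_size=100, window=5, min_count=5)

# how much variance in the word vectors do the first 5 PCs capture?
pca = PCA().fit(model.wv.vectors)
print(pca.explained_variance_ratio_[:5].sum())
```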

But this is really cool: he used UMAP to look at how similar the embeddings were in different topics. It looks to me like there are classic sensory/processing abstracts in the top left, decision-making in the top right, and models on the bottom? Maybe?
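(The UMAP step is essentially a one-liner on top of the word vectors above – umap-learn with default-ish parameters. The layout is fairly sensitive to `n_neighbors`, so treat any particular arrangement of clusters with some suspicion:)

```python
import umap  # pip install umap-learn

# project the word vectors from the sketch above down to 2-D for plotting
coords = umap.UMAP(n_neighbors=15, min_dist=0.1).fit_transform(model.wv.vectors)
```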

And if he performs hierarchical clustering:
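(I don’t know exactly which linkage Richard used; scipy’s Ward linkage on the same word vectors is the generic version:)

```python
from scipy.cluster.hierarchy import linkage, dendrogram
import matplotlib.pyplot as plt

Z = linkage(model.wv.vectors, method="ward")  # assuming Ward linkage
dendrogram(Z, no_labels=True)
plt.show()
```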

Finally, Richard can look at how similar words are in the abstracts. What is most similar to dimensionality-reduction? RNNs.

What are most like MANIFOLDS? Deep networks, population coding, and cerebellum (???).

Who looks at oscillations? People who study pyramidal neurons and hippocampus.
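(These similarity queries are the standard gensim nearest-neighbour lookup on the same model – the exact query tokens depend on how the abstracts were tokenized, so the ones below are illustrative:)

```python
# nearest neighbours in the embedding space
model.wv.most_similar("manifold", topn=5)
model.wv.most_similar("oscillation", topn=5)
```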

All of this can be found in a notebook at Richard’s GitHub.

Finally, how is everyone connected? I have plotted everyone who is attending Cosyne2020, where connections are between any two people who have co-authored an abstract. Please note that for technical and historical reasons, I identify authors by (first initial, lastname). This leads to some ambiguity because sometimes two people share this ID.
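(Building the graph itself is straightforward with networkx, reusing the `author_id` helper from the Hierarch sketch above – which is also exactly where the (first initial, lastname) ambiguity creeps in:)

```python
from itertools import combinations
import networkx as nx

G = nx.Graph()
for abstract in abstracts:                          # one author-name list per abstract
    ids = {author_id(a) for a in abstract}          # collapse to (first initial, lastname)
    G.add_edges_from(combinations(sorted(ids), 2))  # every co-author pair gets an edge
```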

Click the picture for a high-res PDF.

There are too many people who have attended Cosyne over the years to meaningfully visualize everyone, so I have split them into two groups. First, the Superusers: people who have authored 10+ abstracts, shown with their co-authors who have also been on 10+ abstracts.

Probably more interesting to most people is the graph of the Regulars – people who have been on 5+ abstracts.

I’m just going to pull out a few (colored) clusters here:

And finally, the connected components of everyone at Cosyne2020!


#CCN2019 by the numbers

I am in Berlin for the Cognitive Computational Neuroscience (CCN) conference. It is an interesting collection of people working on more human (though some animal) cognitive neuroscience, often using neural network models. Now in its third year, CCN is a nice contrast to Cosyne, a conference more focused on traditional systems neuroscience along with computational modeling.

While I’m here, I thought I would do a quick analysis along the lines of what I have done in years past for Cosyne. I only have one year’s worth of data so there is a limit on what I can analyze but I wanted to know – who was here?

The most posters (abstracts) were from the lab of Simon Kelly. There is not a lot of overlap with Cosyne here, with the exception of Tim Behrens – a bountiful contributor to the Cosyne 2019 conference as well – and perhaps Wei Ji Ma? So it seems the communities are at least somewhat segregated for now.

There is also the co-authorship network: who was on posters with whom? That is above (click through for a high-resolution PDF). There are ~222 connected components (distinct subgraphs, visualized at the top of the page) and the largest connected component is relatively small. It will be interesting to see if this changes as the community coheres over the next few years.
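(Counting those components is a couple of lines of networkx, assuming `G` is a co-authorship graph built the same way as the Cosyne one sketched earlier:)

```python
import networkx as nx

components = list(nx.connected_components(G))
print(len(components))                  # ~222 distinct subgraphs for CCN 2019
print(max(len(c) for c in components))  # the largest is still relatively small
```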

That’s it for this year! Next year I will try to take data from the (short) life history of the conference.

An ethology reading list

At a meeting in New York last week [edit: many months ago by the time I got around to posting this], we were discussing the recent push in neuroscience for more naturalistic behaviors. One of the problems, someone pointed out, is that they are difficult to analyze. But surely there must be whole fields devoted to understanding natural behaviors? Why do we, as neuroscientists, not interact with them?

When I started this blog I named it neuroecology for exactly that reason: there was this whole field of ecology that has thought about natural behaviors very deeply for a long, long time, and going over those papers on a blog seemed like a great way to understand them. What I didn’t understand at the time was that I was using the wrong word; it wasn’t ecology that I was looking for, it was ethology. Ecology is more generally about broad interactions between animals and their environments. Ethology is the specific study of animal behavior.

So: ethologists. The studiers of natural animal behavior. What can neuroscientists learn from these mythical beings? I tried to collect as many syllabi as I could find (1, 2, 3, 4, 5, 6 – with thanks to Bence Ölveczky for sending me theirs in personal communication) to find papers that neuroscientists will find relevant for understanding how to analyze natural behaviors (with a few that I think are relevant thrown into the mix).

Consider this post a “living document” that I will update over time. Mostly it is a big list of papers that I have separated into sub-topics that badly need cleaning up. If I’m missing something, let me know!


Please help me identify neuroscientists hired as tenure-track assistant profs in the 2018-19 faculty job season

For the past two years, I tried to crowd-source a complete list of everyone who got hired into a neuroscience faculty job over the previous year. I think the list has almost everyone who was hired in the US… let’s see if we can do better this year?

I posted an analysis of some of the results here – one of the key “surprises” was that no, you don’t actually need a Cell/Nature/Science paper to get a faculty job.

If you know who was hired to fill one or more of the listed N. American assistant professor positions in neuroscience or an allied field, please email me with this information (neurorumblr@gmail.com).

To quote the requirements:

I only want information that’s been made publicly available, for instance via an official announcement on a departmental website, or by someone tweeting something like “I’ve accepted a TT job at Some College, I start Aug. 1!” If you want to pass on the information that you yourself have been hired into a faculty position, that’s fine too. All you’re doing is saving me from googling publicly-available information myself to figure out who was hired for which positions. Please do not contact me to pass on confidential information, in particular confidential information about hiring that has not yet been totally finalized.

Please do not contact me with nth-hand “information” you heard through the grapevine. Not even if you’re confident it’s reliable.

I’m interested in positions at all institutions of higher education, not just research universities. Even if the position is a pure teaching position with no research duties.

#Cosyne19, by the numbers

As some of you might know, there’s been a lot of tumult surrounding this year’s Cosyne (Computational and Systems Neuroscience) conference. The number of submissions skyrocketed from the year before and the rejection rate went from something like 40% to something like 60% – there were over 1000 abstracts submitted! Crazier still, there is a waitlist just to register for the conference. So what has changed?

Lisbon, Lisbon, Lisbon. This is the first year that the conference has been in Europe and a trip to Portugal in the middle of winter is pretty appealing. On the other hand, maybe Cosyne is going the way of NeurIPS and becoming data science central? Let’s see what’s been going on.

You can see from the above that the list of most active PIs at Cosyne should look pretty recognizable.

First up is who is most active – and this year it is Jonathan Pillow, whom I dub this year’s Hierarch of Cosyne. The most active in previous years are:

  • 2004: L. Abbott/M. Meister
  • 2005: A. Zador
  • 2006: P. Dayan
  • 2007: L. Paninski
  • 2008: L. Paninski
  • 2009: J. Victor
  • 2010: A. Zador
  • 2011: L. Paninski
  • 2012: E. Simoncelli
  • 2013: J. Pillow/L. Abbott/L. Paninski
  • 2014: W. Gerstner
  • 2015: C. Brody
  • 2016: X. Wang
  • 2017: J. Pillow
  • 2018: K. Harris
  • 2019: J. Pillow

If you look at the most active across all of Cosyne’s history, you can see things shift and, remarkably, someone is within striking distance of taking over Liam Paninski’s top spot (I fully expect him to somehow submit 1000 posters next year despite there being a rule specifically designed to limit how much he can submit!).

It is interesting to look at the dynamics through time – I have plotted the cumulative posters by year below and labeled a few people. It looks like you can see when the Paninski rule was implemented (2008 or 2009) and when certain people became PIs (Surya Ganguli became suspiciously more productive in 2012).
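(The cumulative plot is a groupby-and-cumsum. A sketch, assuming a dataframe `df` with one row per (author, year) abstract appearance and a hand-picked `top_authors` list to label – both hypothetical names:)

```python
import pandas as pd

# counts: rows are authors, columns are years, values are abstracts that year
counts = df.groupby(["author", "year"]).size().unstack(fill_value=0)
cumulative = counts.sort_index(axis=1).cumsum(axis=1)
cumulative.loc[top_authors].T.plot()   # one cumulative line per labeled author
```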

Adam Charles suggested that we should look at virality – if a person’s ideas were a disease, who would spread their ideas (diseases) most effectively? Working from a measure defined here, he calculated the most viral people at Cosyne19:

And also if you normalize for the number of nodes the people are directly connected to:

And similarly for Cosyne 2004 – 2019:

In other words, are you viral because you are linked to a lot of people who in turn are linked to a lot of people (top figure)? Or are you viral because you are connected to a broad collection of semi-viral co-authors (bottom figure)?
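(Adam used the specific spreading-power measure from the linked paper, which I won’t try to reproduce here. As a rough stand-in for the same intuition: eigenvector centrality captures “linked to well-linked people”, and dividing it by degree approximates the normalization in the second figure:)

```python
import networkx as nx

# NOT the measure Adam used - a networkx stand-in for the same intuition
ec = nx.eigenvector_centrality(G, max_iter=1000)
norm = {n: ec[n] / G.degree(n) for n in G if G.degree(n) > 0}
```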

It’s been remarked that the lists above are pretty male-heavy. I thought that maybe the non-PIs would be more diverse? So I plotted the number of posters from 2013 – 2019 (mislabeled below) where I have author ordering: how many posters does each person have where they are neither the last nor second-to-last author, given that they are not on the PI list above? The list below is, uh, not any better at representation.


What is it that got accepted in Cosyne19? These are the most common words in the abstracts:

These are the words that are more popular in 2019 than in the 2018 abstracts:

Conversely, these are the words that are less popular this year than previous years. Sorry, dopamine.

I had thought that maybe the increased popularity at Cosyne was because of an increase in participation from NeurIPS refugees. If so, it doesn’t show up in the list of words above. I tried various forms of topic modeling to try to parse out the abstracts. I’ve never found a way of clustering the abstracts that I find satisfying – the labels I get out never correspond to my intuition for how the subfields should be partitioned – but here is an embedding using doc2vec of all the abstracts from 2017 – 2019:

And here is an embedding in the same space but only for 2019 abstracts. Not so different!
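(The embedding itself is gensim’s Doc2Vec, assuming `tokenized_abstracts` is a list of token lists; the 2-D layout then comes from running UMAP on the document vectors, same as with the word vectors earlier:)

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

docs = [TaggedDocument(words=tokens, tags=[i])
        for i, tokens in enumerate(tokenized_abstracts)]
d2v = Doc2Vec(docs, vector_size=100, min_count=3, epochs=40)
vectors = [d2v.dv[i] for i in range(len(docs))]  # one vector per abstract
```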

And if we look at the number of abstracts that contain words relating to different model organisms – or just “modeling”, “models”, “simulations”, etc. – we see it’s stayed pretty much the same year-to-year.
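(This one is plain substring counting – the term list below is illustrative, not the exact set I searched for:)

```python
ORGANISM_TERMS = {"mouse", "monkey", "human", "zebrafish", "drosophila",
                  "model", "models", "modeling", "simulation", "simulations"}

def fraction_mentioning(abstracts, terms):
    """Fraction of abstracts containing at least one of the terms."""
    hits = sum(any(t in a.lower() for t in terms) for a in abstracts)
    return hits / len(abstracts)

# {year: fraction_mentioning(abstracts_by_year[year], ORGANISM_TERMS) for year in years}
```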

Maybe it is a different group of people who are at the conference? Visualizing the network diagram of co-authorships reveals some of the structure in the computational neuroscience community (click image for zoomable PDF):

Some highlights from this:

IDK WTF is going on at the Allen Institute but I like it:

Geography is pretty meaningful. The Northeast is more clustered than you would expect from chance:

As are the Palo Alto Pals:

Here is a clustering of everyone who has been to Cosyne since 2004 and has at least five co-authors. It’s a mess! (click image for zoomable PDF)

Okay this grouping looks pretty similar. Are they the same people? If I look at the proportion of last authors on each abstract who have never been to Cosyne before, it looks like the normal level of inflow – no big new groups of people.

But the number of authors on each abstract has grown pretty heavily:

One thing that is changing is the proportion of authors who belong to the largest subgraphs of the network – that is, who is connected to the “in-group” of Cosyne. And the in-group is larger than ever before:

It’s a bit harder to see here – partly because there are two large subgraphs this year instead of one big glob – but mean path length (how long it takes to get from one author to another) and network efficiency (a similar metric that is more robust to size) both indicate a more dispersed set of central clusters. I’m not quite sure why, but it is possible that the central group is replicating itself: the same people, still weakly connected to former PIs/collaborators, opening their own labs, getting a little further away but not too far…
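(All three statistics are standard networkx calls on the co-authorship graph; mean path length is only defined within a connected component, hence the restriction to the giant one:)

```python
import networkx as nx

giant = G.subgraph(max(nx.connected_components(G), key=len))
print(len(giant) / len(G))                     # share of authors in the "in-group"
print(nx.average_shortest_path_length(giant))  # mean path length within it
print(nx.global_efficiency(G))                 # defined even for disconnected graphs
```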

All in all, it looks like there was an increase in submissions – probably because of the European/Lisbonian location – but no real change in the submissions that were accepted.

Interesting neuro/ML discussions on twitter, 1/9/19

It seems like it might be useful to catalogue the interesting twitter threads that pop up from time to time. They can be hard to parse and easy to miss but there is a lot of interesting and useful stuff. I am going to focus on *scientific result*-related threads. I don’t know if this will be useful – consider it an experiment. Click on the tweets to read more of the threads.


How to tweet about your science #sciencestorm #bitesizescience

Everyone should tweet about their science. Not only will other scientists on Twitter see it, but plenty of other scientists who are not active on Twitter – but pay attention to it! – will see it as well. But the way that you write your tweet will make a huge difference in the amount of attention it gets. No matter how interesting your science is, no matter how finely crafted your paper is, if a tweet isn’t written well it won’t diffuse through Twitter very well.

I’m also going to see if I can start a hashtag: something like #sciencestorm or #bitesizescience added to the first tweet that describes science. I love reading these stories and I wish they were easier to find. Adding a hashtag lets people quickly search for them and find them.

Here are a few tips in no particular order:

  1. Don’t just tweet the title of your paper and add a link. That’s a good first step – but what you really want is a series of tweets that slowly explains what you found. Think about this as your chance to provide an accessible narrative about what you found and what you think is most interesting.
  2. Be excited about your research! People will be excited for you. It’s infectious seeing how happy and excited people are when their papers are published. They want to be supportive and congratulate you.
  3. People want to learn something. If you can condense the messages of your paper into short facts, it will get more traction.
  4. Always, always, always include an image. It almost doesn’t matter what the image is – just being there will add a huge uptick in people paying attention to it. But the best image lets a person look at it and understand the paper in a single shot. It can be a figure from your paper, it can be a schematic, it can be a few figures. People want to learn something.
  5. If you can, add a video. People love videos even more than they love images! Doing optogenetics? Show a video of a light going on and your animal immediately changing their behavior – people love that shit. Doing cell biology? Show a video of a cell moving or changing or something.
  6. This is stupid, but the time that you tweet matters a bit. Be aware that fewer people are paying attention at 2AM PST than, say, 9AM PST. Think about who – across the world – is awake when you are tweeting.

Let’s go through three examples (which I have been trying to collect here).

The first is a series of tweets (“tweetstorm”) by Carsen Stringer describing her work looking at the fractal dimension of neural activity. Now just typing those words I’m thinking “ugh this sounds so complicated” – and I have a master’s in math! But that’s not how she described it. She slowly builds up the story starting with the central question, providing examples, explaining concepts. Even if you have no clue what fractal dimensionality means you will learn a lot about the work and get excited by the paper. In a way that a single tweet would not.

She also makes sure to use explanatory pictures well. Even in the absence of explanation, the simple act of having a picture drives people to engage with the tweet. Look at these examples side-by-side:

Which of the two above looks more interesting? The plain boring text? Or the text with some friendly fox faces? Pictures make a bigger difference than you’d think (which is not to say that every tweet needs a picture – but they help, a lot).

Another example is this from Michael Eisen. This is in a slightly different style that starts off describing the historical background:

What the tweetstorm also provides is insight into how they made their discovery. You get to feel like you are being carried along their scientific process!

The final example got me right away. I saw this tweet and I couldn’t help but smile. Dom Cram is studying meerkats so he made some meerkat legos. I didn’t even know if I cared about the study but I definitely cared about looking at more lego meerkats (and then I realized I thought the study was interesting)…

If you enjoy this kind of thing, get creative! It’s fun and people want to have fun and learn about your science at the same time.


Can we even understand what the responses of a neuron ‘represent’?

tl;dr:

  • Deep neural networks are forcing us to rethink what it means to understand what a neuron is doing
  • Does it make sense to talk about a neuron representing a single feature instead of a confluence of features (a la Kording)? Is understanding the way a neuron responds in one context good enough?
  • If a neural response is correlated with something in the environment, does it represent it?
  • There is a difference in understanding encoding versus decoding versus mechanistic computations
  • Picking up from an argument on twitter
  • I like the word manticore

What if I told you that the picture above – a mythical creature called a manticore – had a representation of a human in it? You might nod your head and say yes, that has some part of a human represented in it. Now what if I told you it had a representation of a lion? Well you might hem and haw a bit more, not sure if you’d ever seen a grey lion before, or even a lion with a body that looked quite like that, but yes, you’d tentatively say. You can see a kind of representative lion in there.

Now I go further. That picture also represents a giraffe. Not at all, you might say. But I’d press – it has four legs. It has a long body. A tail, just like a giraffe. There is some information there about what a giraffe looks like. You’d look at me funny and shrug your shoulders and say sure, why not. And then I’d go back to the beginning and say, you know what, this whole conversation is idiotic. It’s not representative of a human or a lion or a giraffe. It’s a picture of a manticore for god’s sake. And we all know that!

Let’s chat about the manticore theory of neuroscience.

One of the big efforts in neuroscience – and now in deep neural networks – has been to identify the manticores. We want to know why this neuron is responding – is it responding to a dark spot, a moving grating, an odor, a sense of touch, what? In other words, what information is represented in the neuron’s responses? In deep networks we want to understand what is going on at each stage of the network so we can build them better. And because of the precise mathematical nature of these networks, we can probe every nook and cranny of them. This more precise understanding of artificial network responses seems to be leading to a split between neuroscientists and those who work with deep networks over how to think about what neurons are representing.

This all started with the Age of Hubel and Wiesel: they found that they could get visual neurons to fire by placing precisely located circles and lines in front of an animal. This neuron responded to a dark circle here. That neuron responded to a bright line there. These neurons are representing the visual world through a series of dots and dashes.

And you can continue up the neural hierarchy and the complexity of stimuli you present to animals. Textures, toilet brushes, faces, things like this. Certain neurons look like they code for one thing or another.

But neurons aren’t actually so simple. Yes, this neuron may respond to any edge it sees on an object but it will also respond differently if the animal is running. So, maybe it represents running? And it also responds differently if there is sound. So, it represents edges and running and sound? Or it represents edges differently when there is sound?

This is what those who work with artificial neural networks appreciate more fully than we neuroscientists do. Neurons are complex machines that respond to all sorts of things at the same time. We are like the blind men and the elephant, desperately trying to grasp what this complex beast of responses really is. But there is also a key difference here. The blind men come to different conclusions about the whole animal after sampling just a little bit of it, and that is not super useful. Neuroscientists have the advantage that they may not care about every nook and cranny of the animal’s response – it is useful enough to explain what the neuron responds to on average, or in this particular context.

Even still, it can be hard to understand precisely what a neuron, artificial or otherwise, is representing to other neurons. And the statement itself – representation – can mean some fundamentally different things.

How can we know what a neuron – or a collection of neurons – is representing? One method has been to present a bunch of different stimuli to a network and ask what it responds to. Does it respond to faces and not cars in one layer? Maybe motion and not still images in another?

You can get even more careful measurements by asking about a precise mathematical quantity, mutual information, that quantifies how much of a relationship there is between these features and neural responses.
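(For concreteness, the crudest version of this is to bin both the feature and the response and plug into the standard estimator – real analyses need bias corrections, but the idea is just:)

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def mi_bits(feature, response, bins=10):
    """Crude binned mutual information estimate, in bits."""
    f = np.digitize(feature, np.histogram_bin_edges(feature, bins))
    r = np.digitize(response, np.histogram_bin_edges(response, bins))
    return mutual_info_score(f, r) / np.log(2)  # sklearn returns nats
```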

But there are problems here. Consider the (possibly apocryphal) story about a network that was trained to detect the difference between Russian and American tanks. It worked fantastically – but it turned out that it was exploiting the fact that one set of pictures were taken in the sunlight and another set was taken when it was cloudy. What, then, was the network representing? Russian and American tanks? Light and dark? More complex statistical properties relating to the coordination of light-dark-light-dark that combines both differences in tanks and differences in light intensities, a feature so alien that we would not even have a name to describe it?

At one level, it clearly had representations of Russian tanks and American tanks – in the world it had experienced. In the outside world, if it was shown a picture of a bright blue sky it may exclaim, “ah, [COUNTRY’S] beautiful tank!” But we would map it on to a bright sunny day. What something is representing only makes sense in the context of a particular set of experiences. Anything else is meaningless.

Similarly, what something is representing only makes sense in the context of what it can report. Understanding this context has allowed neuroscience to make strides in understanding what the nervous system is doing: natural stimuli (trees and nature and textures instead of random white noise) have given us a more intimate knowledge of how the retina functions and the V2 portion of visual cortex.

We could also consider a set of neurons confronted with a ball being tossed up and down and told to respond with where it is. If you queried the network to ask whether it was using the speed of the ball to make this decision you would find that there was information about the speed! But why? Is it because it is computing the flow of the object through time? Or is it because the ball moves fastest when it is closest to the hand (when it is thrown up with force, or falls down with gravity) and slowest when it is high up (as gravity inevitably slows it down and reverses its course)? Yes, you could now read out velocity if you wanted to – in that situation.
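(You can see the trap in a toy simulation: a “neuron” that encodes nothing but the ball’s height still carries plenty of information about its speed, purely because physics correlates the two:)

```python
import numpy as np

# a ball tossed straight up: v0 = 10 m/s, g = 9.8 m/s^2
t = np.linspace(0, 2, 200)
height = 10.0 * t - 0.5 * 9.8 * t**2
speed = np.abs(10.0 - 9.8 * t)

# a unit that responds only to height is nonetheless "informative" about speed
print(np.corrcoef(height, speed)[0, 1])  # strongly negative correlation
```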

There are only two ways to understand what it is representing: in the context of what it is asked to report, or by understanding precisely the mechanistic series of computations that gives rise to the representation – and checking that it maps onto ‘our’ definition.

Now ask yourself a third question: what if the feature were somehow wiped from the network and it did fine at whatever task it was set to? Was it representing the feature before? In one sense, no: the feature was never actually used, it was just some noise in the system that happened to correlate with something we thought was important. In another sense, yes: clearly the representation was there because we could specifically remove that feature! It depends on what you mean by representation and what you want to say about it.

This is the difference between encoding and decoding. It is important to understand what a neuron is encoding because we do not know the full extent of what could happen to it or where it came from. It is equally important to understand what is decoded from a neuron because this is the only meaningfully-encoded information! In a way, this is the difference between thinking about neurons as passive encoders versus active decoders.

The encoding framework is needed because we don’t know what the neuron is really representing, or how it works in a network. We need to be agnostic to the decoding. However, ultimately what we want is an explanation for what information is decoded from the neuron – what is the meaningful information that it passes to other neurons. But this is really, really hard!

Is it even meaningful to talk about a representation otherwise?

Ultimately, if we are to understand how nervous systems are functioning, we need to understand a bit of all of these concepts. But in a time when we can start getting our hands around the shape of the manticore by mathematically probing neural responses, we also need to understand, very carefully, what we mean when we say “this neuron is representing X”. We need to understand that sometimes we want to know everything about how a neuron responds and sometimes we want to understand how it responds in a given context. Understanding the manifold of possible responses for a neuron, on the other hand, may make things too complex for our human minds to get a handle on. The very particulars of neural responses are what give us the adversarial examples in artificial networks that seem so wrong. Perhaps what we really want is to not understand the peaks and troughs of the neural manifold, but some piecewise approximations that are wrong but understandable and close enough.

Other ideas

  • Highly correlated objects will provide information about each other. May not show information in different testing conditions
  • What is being computed in high-D space?
  • If we have some correlate of a feature, does it matter if it is not used?
  • What we say we want when we look for representations is intent or causality
  • This is a different concept than “mental representation”
  • Representation at a neuron level or network level? What if one is decodable and one is not?
  • We think of neurons as passive encoders or simple decoders, but why not think of them as acting on their environment in the same way as any other cell/organism? What difference is there really?
  • Closed loop vs static encoders
  • ICA gives you oriented gabors etc. So: is the representation of edges or is the representation of the independent components of a scene?