Being A Scientist

From Colettapedia

Human Subjects Research

Human Subjects Research Protections

Statistical Inference


  • lemma - a proven statement used as a stepping-stone toward the proof of another statement


  • Act like everyone in your field is a collaborator. Read other people's papers and cite other people's data in presentations.
  • Find some rich territory to mine!
  • Where is the new theory here?
  • Where is the new territory? How are you moving the ball forward?
  • What is this thing that the world needs to see?
  • You might post your paper on a preprint server such as arXiv to get initial peer review.
    • Nikita: "Posting rejected paper on ArXiv might enable unscrupulous editors to steal work for them and their students."
  • "Impact Factor"
  • publishing a lot of papers in top tier journals that get cited often.
  • "Publications are the currency of science." -Shepard
  • Need to justify our work. Show that you're still in the business. "We have to publish in IEEE. We are IEEE"
  • The astrophysics community posts papers to arXiv both when they are submitted and when they are published, to make sure that they get the credit. -Lior. "Showing your cards and your data before all the bets are in." -Mark
  • "AI in Medicine", "Ploss" Cell Biology
  • First-tier versus second-tier journals - a second-tier journal is still peer reviewed and still citable, but has a lower impact factor.
  • How many people are reading and publishing in your field? Several thousand.
  • Deadlines for publication
  • You have to stand on the shoulders of giants in order to be reasonable. Otherwise you are banished to writing an algorithm and publishing it, and nobody uses it for 15 years.
  • provide better signal
  • "a demonstration project"
  • "there's a ton of work" - "Minoru doesn't need anybody suggesting new experiments to him."
  • "good data that's also free is hard to come by."
  • don't bias your data - have to go into the experiment completely blind, and let the data speak for itself.
  • you can think of a million experiments to do - but you should only do ones that are informative.
  • Keep the tone down when talking about other people's work. No direct insults. If anything, point out the shortcomings and limitations of what other people have done by implication, in a non-threatening manner.
  • Ill-posed vs. well-posed problem


  1. Develop method
  2. Demonstrate feasibility of method
  3. Validate method
  4. Parameter scan

Scientific Writing

  • Ilya: "Try to look at every word and phrase you write and attempt to eliminate it. Try to be as brutal as possible while doing this. Scientific writing is all about density - use as few words as possible, then try to use even fewer. All this without sacrificing any of the detail or ideas you're trying to convey. Density. Mine is a little shorter, but gives a little more detail, without eliminating anything you said. "

Experiment Setup

  • A clearly defined problem with a clear endpoint
  • random error - fix with larger n, averaging
  • systematic error - a systematic bias is unrecoverable! No increase in n will help you.
  • how do you know that you're done?
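
The contrast between random and systematic error above can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the true value, noise level, and bias are illustrative numbers, and `measure` and `mean_abs_error` are hypothetical helper names, not anything from these notes.

```python
import random
import statistics

TRUE_VALUE = 10.0  # assumed ground truth (illustrative)
NOISE_SD = 2.0     # assumed random error of a single measurement
BIAS = 0.5         # assumed systematic offset of the instrument


def measure(n, bias=0.0, seed=0):
    """Mean of n simulated measurements (hypothetical helper)."""
    rng = random.Random(seed)
    samples = [TRUE_VALUE + bias + rng.gauss(0.0, NOISE_SD) for _ in range(n)]
    return statistics.fmean(samples)


def mean_abs_error(n, bias, trials=100):
    """Average absolute error of the n-sample mean over repeated experiments."""
    errors = [abs(measure(n, bias, seed=t) - TRUE_VALUE) for t in range(trials)]
    return statistics.fmean(errors)


if __name__ == "__main__":
    # Columns: n, error with random noise only, error with noise plus bias.
    for n in (10, 100, 2500):
        print(n, round(mean_abs_error(n, 0.0), 3), round(mean_abs_error(n, BIAS), 3))
```

With noise only, the error of the mean shrinks as n grows. With the bias added, the error levels off near the size of the bias no matter how large n gets.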

Data Collection

  • break open the bottleneck, what's the new bottleneck?
  • using equipment at its limit
  • live, mission-critical data vs. data that are scientific artifacts - make sure the latter endure, are archived, and are made available to the community

Type of Data

  • a baseline - something for other scientists to compare changes to
  • cross-section - non-homogeneous data from a variety of specimens that differ in one or more controlled variables.


  • your collaboration would result in getting a new result from your data
  • It is important to figure out which of the individual collaborators is claiming what, for purposes of attribution and taking credit.
  • Know where each other's feet are so as not to step on them.
  • being pessimistic - what's the least I can get out of this collaboration?


  • publish method, but not results
  • Talk about method before you do it
  • Talk about results before you publish them
  • preliminary results are key
  • have to bring something at least a little half-baked


  • Start with the simplest model, one with no predictors, and add complexity to it.
  • Data with fewer than 100 primary sampling units? You're out of luck.
  • No throwing good work after bad.
  • Not interesting.
  • Go to NCBI and set up email alerts for papers in your project area. This may save you months of working with a different method and then playing catch-up.
  • The currency in this group is the notebook. Name them by date first. Leave breadcrumbs for others to follow. Git is nice to keep track of what you did, probably overkill.
  • Don't start a project in which you intend to publish on a public dataset unless you're at the top of your game, methodologically speaking. Someone will beat you to the punch if you're using it to teach yourself methods. Better to start a learning project on data that you own.
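
The advice above to start with a no-predictor model and add complexity can be sketched as follows. The data are made up for illustration, and `fit_null` and `fit_line` are hypothetical names. The point is only that the intercept-only model gives a baseline that any added predictor has to beat.

```python
import statistics

# Made-up illustrative data: y is roughly linear in x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]


def sse(y_true, y_pred):
    """Sum of squared errors between observations and predictions."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred))


def fit_null(y):
    """Simplest model, no predictors: predict the mean of y everywhere."""
    mean_y = statistics.fmean(y)
    return [mean_y] * len(y)


def fit_line(x, y):
    """One step more complex: least-squares line with a single predictor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [intercept + slope * xi for xi in x]


sse_null = sse(y, fit_null(y))     # baseline error
sse_line = sse(y, fit_line(x, y))  # error after adding one predictor
r_squared = 1.0 - sse_line / sse_null

print("SSE, no predictors:", round(sse_null, 3))
print("SSE, one predictor:", round(sse_line, 3))
print("Fraction of baseline error explained:", round(r_squared, 4))
```

Comparing each more complex model against the previous one this way makes it clear whether the added complexity explains anything the simpler model did not.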