Week 3: Reproducible research, sustainable software Flashcards
Best Practices for Scientific Computing
Scientists spend an increasing amount of time building and using software. However, most scientists are never taught how to do this efficiently. As a result, many are unaware of tools and practices that would allow them to write more reliable and maintainable code with less effort. We describe a set of best practices for scientific software development that have solid foundations in research and experience, and that improve scientists’ productivity and the reliability of their software.
- Write programs for people, not computers
Your code may be published, someone else may benefit from using it, and you may leave it and come back to it later, so it has to be readable.
- Automate repetitive tasks
If you have to do something many times, write a function and call it in a loop, so that you work through the thought process once and the computer then does the task for you.
- Use the computer to record history
Log the changes that you make and keep track of all the versions.
- Make incremental changes
Make small improvements, so that you can check things are working properly at every level.
- Use version control
Dropbox saves every version of every file for 30 days, so you can restore old versions; this is very handy when you break or delete things by mistake. When working with several people, you tend not to know which file is the most up to date. Version control systems keep track of this and can automatically merge files together.
- Do not repeat yourself (or others).
Reuse code and don’t reinvent the wheel: build on existing code (your own or other people's) instead of writing it again.
- Plan for mistakes
We know we are going to make mistakes, so how do we detect them? One way is to test code, checking as you go that everything is working. The best practice is to have automatic tests that run every time you change the code. Test extreme situations too, and make the code give errors when something is wrong.
- Optimize software only after it works correctly
For biologists, speed is usually not so important: we just need the code to work, and to work correctly.
- Documenting the purpose of code
It is very tempting to write small comments as you write code, but you will quickly be able to read how the code works from the way it has been written. What is harder to identify is the general aim, so that is what to document.
- Conduct code reviews
It is really helpful to show what you have done to somebody else; fresh eyes can see what you might not have.
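As a rough illustration of several cards above (automate repetitive tasks, plan for mistakes, document the purpose of code), here is a minimal Python sketch. The function `gc_content` and the sample sequences are hypothetical, invented only for this example; they are not from the course material. The docstring states the aim rather than the mechanics, the loop replaces repeated manual work, and the assertions act as small automatic tests, including an extreme input (an empty sequence) that should give an error.

```python
def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence.

    Purpose: give one comparable number per sequence.
    Raises ValueError on empty input or unexpected characters
    instead of silently returning a misleading value.
    """
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence is empty")
    unexpected = set(sequence) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


# Automate: write the function once, then loop over every sample.
samples = {"sample_a": "ATGC", "sample_b": "GGCC", "sample_c": "ATAT"}  # made-up data
results = {}
for name, seq in samples.items():
    results[name] = gc_content(seq)


def test_gc_content():
    # Typical cases with answers that are easy to check by hand.
    assert gc_content("ATGC") == 0.5
    assert gc_content("ggcc") == 1.0
    # Extreme case: empty input must give an error, not a number.
    try:
        gc_content("")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for empty sequence")


# Run the test every time the script runs, so mistakes show up early.
test_gc_content()
```

In a larger project the test function would normally live in its own file and be run by a test runner, but calling it directly keeps the sketch self-contained.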
Use a style guide
Google has very good style guides that are respected by many.
Software Sustainability Institute
Its aim is to improve software made for science. They have done a lot of advocacy on how best to do things. Motto: “Better software, better research”
Eliminate redundancy
DRY; Don’t Repeat Yourself
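A minimal sketch of the DRY idea in Python, assuming a hypothetical unit conversion that is needed in two places. The function names and readings are invented for illustration. The conversion is defined once in a helper, so a correction made there reaches every caller.

```python
def celsius_to_kelvin(celsius):
    """The one place where the conversion is defined."""
    return celsius + 273.15


def mean_kelvin(readings_celsius):
    """Average of Celsius readings, reported in kelvin."""
    kelvin = [celsius_to_kelvin(c) for c in readings_celsius]
    return sum(kelvin) / len(kelvin)


def max_kelvin(readings_celsius):
    """Highest reading in kelvin; reuses the same helper."""
    return max(celsius_to_kelvin(c) for c in readings_celsius)


readings = [20.0, 25.0, 30.0]  # made-up data
print(mean_kelvin(readings))
print(max_kelvin(readings))
```

Had each function typed out `+ 273.15` itself, a typo in one copy could go unnoticed while the other stayed correct.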
Track versions of everything
GitHub: like Facebook for code
Random people use your stuff, and find problems – fix and improve it!
Great impact – easily updated – easily collaborated – identify trends – build online reputation