For five years I have been trying to gather information about practices and methods in scientific software development, also called software for science or software for research, and more formally research software engineering.
It seems to have a dark side: systematic, disciplined, and measurable tasks are not in evidence, and worse, my attempts to gather confidence data through quantitative or formal qualitative methods have gone unanswered.
Please give me your opinion and perceptions about this dark side. If you have been part of a software development team, even a one-person team, please describe your practice by answering this short questionnaire. It includes a checkbox to request the results (https://goo.gl/aJ2Sj1).
Consider looking up the Software Carpentry project. It has been running for 20 years in various forms and is intended to help scientists write and share better code. The goal is to help scientists do more science and spend less time on software (less designing, coding, testing, etc.).
In the past 40+ years I have written approximately 3 million lines of code (assembler, FORTRAN, and C). I developed hundreds of computer programs, many of which are available for free at http://dudleybenton.altervista.org/index.html. I have never sat through a single class in computer programming, nor do I have a degree in computer science. I got out of taking the required class by submitting the smallest program ever developed to solve the knight's tour (64 moves landing once on each location of a chess board). The entire self-contained executable file is only 1665 bytes and requires no DLLs. You can download it, including source code (or the 3D version), from the preceding site. My degree is in engineering and I like to consider myself an applied mathematician. I've always made a living solving problems. Sometimes I write software to accomplish this. I rebuilt my first engine without adult supervision at age 11 and started out as a mechanic. Doing front-end alignments is what convinced me to pursue college. The reason you're getting strange answers to questions about software development is like the old saying: Those who can, do. Those who can't, teach. Those who can't teach, teach teachers to teach.
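For readers who have never seen the puzzle coded, here is a purely illustrative Python sketch of a knight's tour using Warnsdorff's heuristic (always jump to the reachable square with the fewest onward moves). It is, of course, nothing like the 1665-byte self-contained executable described above; it just states the algorithm plainly.

MOVES = [(1, 2), (2, 1), (2, -1), (1, -2),
         (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def onward_moves(board, r, c):
    """Count legal, not-yet-visited squares reachable from (r, c)."""
    return sum(1 for dr, dc in MOVES
               if 0 <= r + dr < 8 and 0 <= c + dc < 8
               and board[r + dr][c + dc] == -1)

def knights_tour(start_r=0, start_c=0):
    """Return an 8x8 board holding the visit order 0..63, or None on a dead end."""
    board = [[-1] * 8 for _ in range(8)]
    r, c = start_r, start_c
    board[r][c] = 0
    for step in range(1, 64):
        candidates = [(r + dr, c + dc) for dr, dc in MOVES
                      if 0 <= r + dr < 8 and 0 <= c + dc < 8
                      and board[r + dr][c + dc] == -1]
        if not candidates:
            return None  # the greedy heuristic rarely fails from a corner start
        # Warnsdorff's rule: move to the candidate with the fewest onward moves.
        r, c = min(candidates, key=lambda rc: onward_moves(board, *rc))
        board[r][c] = step
    return board

if __name__ == "__main__":
    tour = knights_tour()
    if tour:
        for row in tour:
            print(" ".join(f"{n:2d}" for n in row))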
In undergrad, I was a double major in Comp Sci and Biochem. I wrote code to help me to understand DNA and to recompile strands. I also worked some on a project that used software to help predict secondary structures of proteins.
I believe the reason this is a "dark" area is because most software engineers do not possess deep knowledge of scientific principles and theories, perhaps filling this gap will help.
The first big successful computer program I wrote that was used internationally modeled evaporative cooling towers. Before writing the program, I examined every part of several cooling towers, climbed a few of the really big ones, and talked to a number of people who designed and repaired them. Before modeling submerged multi-port diffusers, I spent weeks taking field measurements and talking to divers and technicians. Before modeling contaminant transport, I interviewed several outstanding geohydrologists and had them explain the processes in detail. Before modeling an MSR (specialized heat exchanger used in nuclear plants), I crawled inside one. It's not enough to know coding. You must also understand how the software must work. There is no substitute for hands-on experience. You can learn a lot from people who know nothing of computers, but have years of practice in the field.
This is a very odd thread. I don't doubt that scientists are adept in their experimental and observational abilities. But one lament I hear is that software produced as an instrument of a scientific investigation is often poorly engineered in software engineering terms, in terms of it being amenable to use in replications and adaptation to similar projects. And there is always the issue of verification that the software conforms to the scientific model it is intended to reflect. Sometimes it comes down to defective Excel spreadsheets :).
I would not expect general software engineers to have a grasp of particular scientific models and the valid application and *interpretation* of statistical approaches.
So yes, in science and elsewhere there is need for collaborative development of a conceptual model by which the software is suitable to the scientific endeavor.
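To make that verification point concrete, here is a minimal, hypothetical sketch (the function and the model are my own invention, not anyone's actual code) of the simplest way to tie numerical code back to the scientific model it claims to implement: test it against a case with a known closed-form answer, and run it under pytest.

import math
import pytest

def simulate_decay(n0, k, t, steps=100_000):
    """Hypothetical model code: integrate dN/dt = -k*N with forward Euler."""
    dt = t / steps
    n = n0
    for _ in range(steps):
        n -= k * n * dt
    return n

def test_decay_matches_analytical_solution():
    # The analytical solution N(t) = N0 * exp(-k*t) is the scientific model;
    # the numerical routine must reproduce it within a stated tolerance.
    n0, k, t = 1000.0, 0.3, 5.0
    expected = n0 * math.exp(-k * t)
    assert simulate_decay(n0, k, t) == pytest.approx(expected, rel=1e-3)

Even one such test documents what "conforms to the model" means and catches the spreadsheet-style errors mentioned above.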
The use of anecdotal experience is interesting. One thing these comments seem to show me is (1) software engineering is misunderstood by many scientists, and also many computer programmers, and (2) expertise in one field is not evidence for expertise in the other, although we are talking about experiences of smart individuals. (In this regard, the term/title Software Engineering and Computer Science with respect to disciplined activity are often corrupted in these contexts.)
PS: The title chosen by Bob Glass and Johann Rost is intentionally provocative. But notice that they are addressing the management of software projects, certainly not software engineering and project management as it is intended to be practiced, especially in regard to risk management. That there are ethical issues in many settings and conduct of software projects is not something that should be attributed to software engineering any more than industrial or academic research should be so painted.
This question concerns the weakness of software engineering. There is no common theory of software development, and people use problem-specific methods, as in this particular case.
For researchers interested in ways of improving software development, I recommend:
J. Osis, E. Asnina. Is Modeling a Treatment for the Weakness of Software Engineering? In: Intelligent Systems: Concepts, Tools and Applications. IGI Global, 2018. DOI: 10.4018/978-1-5225-5643-5.ch01
and
J. Osis, U. Donins. Topological UML Modeling: An Improved Approach for Domain Modeling and Software Development. Elsevier, 2017, 276 p.
Janis Osis, I think it is about the weaknesses that arise in a specific context and communities of practice. Making this about generic software engineering and some sort of weakness in that companion discipline is off the mark. I don't expect that to be fruitful. If I were to claim that the situation is entirely about a weakness in use and understanding of software by scientists would that be acceptable? I hope not.
I am more inclined to see this as a failure to apply software engineering practices that are relevant to the goals of the scientific undertaking. It would be interesting to elevate the anecdotal accounts here to identification of what qualities were agreed to be desirable regarding "scientific software" and how that was assessed and achieved. It would also be interesting to know what expertise and guidance was relied upon in the outcome, whether successful, unsuccessful, or mixed result. I would be interested in what was measured and how it was managed.
(I must say that this does bring to mind the questioning of craftsmen who blame their tools or relied-upon services. I agree that's a cheap shot. ;)
A strange and deeply misleading answer by Dudley J Benton.
But about the question...
Software is an intellectual product, and as with any other member of this family, the theoretical foundations of its MEANING, EFFECTS, and CONTEXTS have not yet been formalized.
Writing some working code is one thing; studying the foundations of software engineering is another. The work of Sir C.A.R. Hoare on formalizing computing systems, along with that of many other great contributors, was dedicated to the theoretical aspects of software.
The golden decade of software architecture (the 1990s) and the major works of Dr. Garlan, Dr. Shaw, and many other researchers were generally driven by the need for dependable, abstract, and verifiable models as the master blueprints for complex software and systems of systems.
So generally, I do not agree that there is a dark side.
I completed the questionnaire for my first programming project, in 1958, since it was for a scientist and his results were published. I am not certain how the questionnaire ties to the topic of this discussion, but it was interesting to submit it. There was no recognition of either software engineering or computer science at the time, of course, and at 19, I had much yet to learn. Nevertheless, the principal investigator was happy with the results and I must have satisfied his requirements. He had no knowledge of software development but did arrange to provide the data I used, and the outputs passed whatever smell test he applied.
My current career-capstone project (here in my 60th year in computing) is also scientific, but the software is not completed yet. I might return when I have answers. My approach is very different, starting from mathematical formulation of a model of computation I am investigating. Having the proof-of-concept software and its verification available to others is an important result.
Control of human intentions
I will try to give my view as a lifelong computer-science engineer and, for two decades now, a philosopher in the environmental field.
Technical and scientific SW is mostly the last stage of engineering projects (control of machines and electronic devices, connecting measurement systems, representation of microscopic or remote structures…), so the SW must mostly follow the concepts of the "material" part of the project. If it is intended for mass production, the user interface will be adapted to applications of the same range or to the operating system. If not, it is left to scientists, engineers, and very specific users (who can learn) to understand the particular SW's functions. In the background there is the pressure of capital and time: the product or research must bring money or results as soon as possible. This results in poor SW architecture and methods, coding "just to make it work", avoiding tests of unexpected and marginal situations, and using pre-written blocks and libraries without knowing exactly what they do…
I work on concepts of long-lasting, quality products, to lower the turnover of materials and the quantity of waste and to achieve near-100% recycling. This includes possibilities for upgrading, modularly built products, and standardization of materials and constructions. Once this is accepted and applied, the concepts, rules, and solutions of technical and pre-written SW could be developed much further.
The second obstacle is technical and scientific development that is too fast to be followed by society and by social and juridical norms. The final catastrophic effects of such scientific behavior hit nature. The system leaves it to individuals, to their abilities on the market, to expressions of self-importance, and to crowd effects to determine how software finds its way from the research phase to application. Because only the final effect is important, no one asks what happens in between.
Societal norms, and research at that level into the social consequences (as with "social" networks, smartphones, genetic abuses…), would always have to come before certain technical and bio-technical research is allowed to start and its results transferred into application. Social and juridical research would then also set the frames and methods of technical research, and not the opposite, as happens today (source: Alliance for a Responsible, Plural and United World). It is the self-taken freedom of science and capital, backed by power, that takes over from the self-controlled freedom of naturally, socially, and spiritually oriented societies.
There is one big gap in SW applications that would have to join juridical, social, environmental, philosophical, and technical views: control over the execution of legislation, first of all of constitutional, human-rights, and environmental declarations. The tasks of such SW have been described for several years in my book What after Democracy?
https://www.researchgate.net/publication/322330927_WHAT_AFTER_DEMOCRACY_Searching_basic_principles_of_human_life_on_the_Earth
Development of the study version is continuing this year, including development of text and data structures for the appropriate conversion of laws into principles and rules, followed by similar records for projects. The basic task of this control SW is to compare whether the parameters defined in projects match those in the rules. Defined paths for considering the rules for certain types of projects will ensure equal, non-privileged, and non-corrupted evaluation of all running projects and planned intentions, safeguarding first of all natural, environmental, and ecological systems, resources, and food production, and guarding against unfair exploitation of human work. In fact, this SW demands quite a different way of defining legislation: above all clear, simple, and transparent.
That is what humanity and the planet need from science today, rather than the sale of publicly funded knowledge to the interests of capital groups, irrespective of whether it is chaotically written or based on the same principles.
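As a purely illustrative sketch of that comparison task (the rule names, parameters, and data structures below are hypothetical, not taken from the book), the core of such a control program could look roughly like this in Python:

# Hypothetical rules derived from legislation: each named parameter has allowed bounds.
RULES = {
    "groundwater_extraction_m3_per_day": {"max": 500.0},
    "recycled_material_fraction": {"min": 0.9},
}

def evaluate_project(parameters):
    """Return a list of violations; an empty list means the project conforms to the rules."""
    violations = []
    for name, limits in RULES.items():
        value = parameters.get(name)
        if value is None:
            violations.append(f"{name}: parameter not declared in the project")
            continue
        if "min" in limits and value < limits["min"]:
            violations.append(f"{name}: {value} is below the required minimum {limits['min']}")
        if "max" in limits and value > limits["max"]:
            violations.append(f"{name}: {value} is above the allowed maximum {limits['max']}")
    return violations

print(evaluate_project({"groundwater_extraction_m3_per_day": 620.0,
                        "recycled_material_fraction": 0.95}))

The hard part, as described above, is not this comparison but defining legislation clearly enough that its rules can be encoded at all.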
Kevin W Jameson, a link to Software Carpentry: https://github.com/swcarpentry/swcarpentry
That is a useful project. It is about nuts and bolts, with a *nix command-line and Python bent. Nice approach to tooling. There is a good fit with current trends in computing for science.
I am reminded of Fred Brooks' "No Silver Bullet" theme. That is probably worthwhile in the context of this thread. Brooks opined that the hard problems of software development are not about how to build software (e.g., at the level of software carpentry), but about knowing what to build. And about managing its achievement, I would add.
I have no idea what Brooks might say about the current rush to "AI" and Machine Learning. Maybe smelling the smoke of yet-another-silver-bullet missing its target.
If someone wanted to get started developing software as part of a scientific research effort, I think the Software Carpentry introduction to some fundamental tools would be a good place to start. It provides a useful level of fluency. It won't help with understanding how scientific computing projects fail, or what else is needed, but it won't hurt either.
Also, many MOOCs and other on-line activities feature this particular level of tooling, it being a favorite in academic curricula.
In the field of bioinformatics, much scientific software is written by graduate students who are trying to answer a research question. Many are fresh out of undergraduate programs and have no software engineering training. The requirements of the software often change as they make progress in their research. When a graduate student is doing all of the software work on a problem that may not yet be completely defined, the model of "write a formal requirements document up front, then compose a formal model, then code" seems like a fairy tale. They write some code (or modify the work of a former grad student), run it on what data they have, meet with their advisor, repeat.
There are some graduate students and postdocs who do have real-world software experience and manage to write portable code, package it for easy installation, and document it sufficiently for someone else to use it. In a few cases their programs become very widely used in subsequent research, for example sequence alignment tools. Whether they were constructed with formal methods, who knows, but everyone hopes that a peer-reviewed publication means that the software is correct. :)
I think the majority of such projects just die when the graduate student/postdoc moves on.
I have the good fortune to work on the UCSC Genome Browser team, which includes almost as many QA/testing/user support engineers as developers. Our PI (Jim Kent) has a professional software background and knows the value of QA. We can afford to hire them because we are a very widely used resource and therefore (knock on wood) well-funded. If only all scientific projects could be sufficiently funded to hire experienced software engineers and testers! But I don't think that's realistic, as long as the scientific research enterprise is a system of indentured servitude for grads and postdocs.
There's your dark side -- indentured servitude.
And as others have said above, the Software Carpentry organization is doing very valuable work to bring practical software engineering to scientific software development.
I remember that long ago (probably when it first got started) the Software Carpentry project (out of JPL? LLNL?) held a worldwide open-source design contest to build better software tools. There were five categories of tools, if I remember correctly (design, build, debug, etc.). One of them was for software build tools to replace the hated (but the best we have) Unix make tool from the 1970s (and the N derivations of make in different languages with different names, such as CMake, nmake, etc.). My little commercial company at that time had solved the build problem by automating it at scale. We generated something like 500 fresh makefiles a day on 20 platforms for a million lines of code in about 10 languages (C, C++, bash, sh, Perl, Java, etc.). Then we automatically built all that code in instantaneous builds, check-in builds, QA rollover builds, parallel builds, etc. We achieved six-sigma failure rates (< 3.4 build failures per million) on our baseline 1M lines of code, even with about 20 programmers changing code every day. The only way that worked was to remove all human labor from the process (which we did), since humans (human processes) have about a 3.5 sigma failure rate (even hospitals, for example, have roughly a 3.5 sigma, around 7%, failure rate). I wrote a book and a few papers on the technology and gave conference talks for a few years.
The SC project invited me to submit a design to the contest, but I couldn't because I was commercial. I read 85 submissions to the contest, and the best the prize-winning design could do was rewrite make (I think the Perl make) in Python. So, basically no progress was made at all, except a move to Python.
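To illustrate the idea only (this is not the original commercial tool, which remains archived; the module names below are made up), the heart of such automation is generating build files from a declarative description so that no person ever edits a makefile by hand:

# Toy sketch: regenerate the Makefile from a declarative module list on every build.
MODULES = {
    "solver": ["solver.c", "matrix.c"],
    "io_util": ["reader.c", "writer.c"],
}
CC, CFLAGS = "cc", "-O2 -Wall"

def generate_makefile(path="Makefile"):
    lines = [f"CC={CC}", f"CFLAGS={CFLAGS}", "", "all: " + " ".join(MODULES), ""]
    for target, sources in MODULES.items():
        objects = " ".join(src.replace(".c", ".o") for src in sources)
        lines.append(f"{target}: {objects}")
        lines.append(f"\t$(CC) $(CFLAGS) -o {target} {objects}")  # tab-indented link recipe
        lines.append("")
    lines.append("%.o: %.c")
    lines.append("\t$(CC) $(CFLAGS) -c $< -o $@")
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")

generate_makefile()  # run per platform and per check-in rather than editing by hand

Scale that up across platforms, languages, and build types and you get the labor-free pipeline described above.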
In case you're wondering, we subsequently were funded by Silicon Valley venture capitalists and received a new CEO who ran the company into bankruptcy within 2 years by taking on unsolvable technical problems, with me protesting all the way. But, to no avail. The company went bankrupt and the tech was sold to another company who wanted my C-language engineering team. They archived the build tools almost immediately. Although the company (via my ex-right-hand tech guy) has said they are willing to make the tools open source, they haven't bothered to find and upload them from their archives. So, I've watched the world suffer through the usual build problems for 20 years or more. My friend still complains about make, Cmake, ...
There's a dark side story for you! :-)
Scientists other than computer scientists, in their eagerness to solve scientific problems, often don't appreciate until they're in quite deep the benefits of formalized software development methodologies, of which there are many. Less formalism is often more in these contexts, if parsimony and brevity make the software more easily maintainable, as this allows more focus on the scientific problems to be addressed. Trouble tends to accumulate, by contrast, when conventional programming methods are used in control algorithms that don't scale well as the full inherent complexity of the problem domain is realized over time. Eventually such projects reach a tipping point after which they have to be redone from scratch in order to remain viable and relevant. Where there is a scientist with a will, there is usually a way to make legacy software work, but the value going forward of making it understandable for newcomers is often not appreciated early on. There are many dark sides to software engineering, but I would not place this one among them. Science is about shedding light on real-world problems, and the software that aids these goals is usually well-intended.
Here is a paper I found RIGHT ON your subject: "What Scientists and Engineers Think They Know About Software Engineering: A Survey". The basic reason is a lack of software engineering knowledge in some closed scientific communities that have to develop, maintain, and use complex software.
Hope this gives you some food for thought!
Cheers!
I pretty much agree with what David said.
Just let me add that I find the issues raised here to be particularly "dangerous" for scientific software that is being used
a) by more than one person or
b) for a longer period of time than originally intended or
c) on different architectures, especially in the field of HPC, which also might change over time (e.g., a code base started 20 years ago could never have anticipated today's many-core systems).
Many codes, I guess, start off as a one-person project and then might grow bigger. In this step of growing bigger, most of the time non-software-engineers work on the code, and unfortunately many scientists don't value software engineering that much. Therefore most scientific software ends up being very difficult to use, maintain, and understand. And especially the understanding part also makes it hard for many projects to get new people to join the team.
Having said that, I think there's also a number of scientific codes that are also good pieces of software from the engineering point of view. So I'm not saying the situation is as bad everywhere.
Lastly, please let me add to the list of valuable resources here:
- Software Carpentry (already mentioned)
- Various RSE associations like https://rse.ac.uk/ (Slack channel: https://ukrse.slack.com/)
- Better Scientific Software: https://bssw.io/
Bigger budgets for science projects would enable them to hire more software specialists. They would also have to see the need to do so, and to avoid excess when less will do just as well and when more might distract from a focus on the scientific objectives. The managers should also be scientists to help ensure the proper focus and balance.
@Bruno Martin, concerning "What Scientists and Engineers Think They Know ...", the link to a reading list is now obsolete. I obtained a usable one from Roscoe Bartlett: https://bartlettroscoe.github.io/readingList.html
David, I'm not too sure whether bigger budgets would actually help much in the scientific world, because from my perception it seems that software engineers currently aren't accepted as "scientists" by many "classical scientists", which I guess makes it difficult to hire enough of them on a science budget. I hope this will change ASAP, but currently I think it's just another one of those hoops you need to jump through.
This event may be of interest for its evolving ideas about software engineering and about certain kinds of methods. Whether or not one finds Jacobson's preferences appealing, I think the notion of raising the level of fluency and competence in practice is relevant.
The registration link I received is https://event.on24.com/wcc/r/1797723/B74CAD05FEF193B794A8E661185A79DF
It should work for anyone.
This is at a rather different level than how software engineering (pro and con) has been portrayed in this discussion. It is interesting to me to learn how comprehensible Jacobson's ideas and account are to practicing scientists interested in having durable computational artifacts.
- Dennis
Register for the December 20 ACM SIGSOFT Talk, "50 Years of Software Engineering, So Now What?"
Register now for the next free ACM SIGSOFT Learning Webinar, "50 Years of Software Engineering, So Now What?", presented live on Thursday, December 20 at 12 PM ET by Ivar Jacobson, Founder and Chairman of Ivar Jacobson International. Pekka Abrahamsson, Professor of Information Systems and Software Engineering at the University of Jyväskylä in Finland, will moderate the talk.
(If you'd like to attend but can't make it to the virtual event, you still need to register to receive a recording of the webinar when it becomes available.)
Note: You can stream this and all ACM SIGSOFT Learning Webinars on your mobile device, including smartphones and tablets.
Software Engineering was the theme of a 1968 conference in Garmisch, Germany, with the leading computer scientists and methodologists (at the time) in the world. That meeting is considered to be the beginning of software engineering, and we have now been developing the discipline for 50 years.
"This is not the end, it is not even the beginning of the end, but it is perhaps the end of the beginning" (Winston Churchill).
We are more than 20 million software developers on the planet, with a large number of methods for developing software. However, the most successful recipe is to hire the most brilliant people in the world and empower them to create wonders. 50 years ago, Ericsson in Sweden did that. Now Apple, Google, Amazon, etc., do that.
What about the rest of the world? – banks, insurance, airlines, defense, telecom, automotive, etc. How can we get these industries to be more innovative and develop better software, faster, cheaper, and with happier customers? How can we do that given that the state of the art of our discipline is in such chaos, characterized by the multitude of competing methods out there?
The most powerful way to help the rest of the world to build excellent software is to dramatically increase the competency (and skill) of all of us. There are no shortcuts. Education must start from an understanding of the heart of software development, from a common ground that is universal to all software development endeavors. The common ground must be extensible to allow for any method with its practices to be defined on top of it. This would allow us to sort out the chaos and to increase the competency of all of us. As a plus, that competency increase wouldn't hurt the brilliant people, but make them even more productive than today.
In this presentation Dr. Ivar Jacobson will revisit the history of methods, explain why we need to break out of our repetitive dysfunctional behavior, and introduce Essence: a new way of thinking that promises many things, one of them being to dramatically change the way we educate in software development to increase the competency in our profession.
Duration: 60 minutes (including audience Q&A)
Presenter: Ivar Jacobson, Founder and Chairman, Ivar Jacobson International. Dr. Ivar Jacobson received his Ph.D. in computer science from KTH Royal Institute of Technology, was awarded the Gustaf Dalén medal from Chalmers in 2003, and was made an honorary doctor at San Martin de Porres University, Peru, in 2009. Ivar has had both an academic and an industrial career. He has authored ten books, published more than a hundred papers, and is a frequent keynote speaker at conferences around the world. Ivar is a father of components and component architecture, work that was adopted by Ericsson and resulted in the greatest commercial success story in the history of Sweden, which it remains to this day. He is the father of use cases and Objectory, which, after its acquisition by Rational Software in 1995, resulted in the Rational Unified Process, a widely adopted method. He is also one of the three original developers of the Unified Modeling Language. But all this is history. Ivar founded his current company, Ivar Jacobson International, which since 2004 has focused on using methods and tools in a smart, super-light, and agile way. As a result of that work, Ivar became a founder and leader of a worldwide network, SEMAT, whose mission is to revolutionize software development based on a kernel of software engineering. The kernel has been realized as a formal OMG standard called Essence.
Moderator: Pekka Abrahamsson, Professor, University of Jyväskylä, Finland. Dr. Pekka Abrahamsson is a tenured professor of information systems and software engineering at the University of Jyväskylä in Finland. He is also the director of the startup lab at JYU. Prior to his current position he served as a full professor at NTNU in Norway, the Free University of Bolzano in Italy, and the University of Helsinki in Finland. His research interests are in Essence in software engineering, software startups, and the ethics of artificial intelligence. He received the Nokia Foundation award in 2007 for his achievements as a software researcher, and in 2016 he was named a Top-100 Most Influential Scholar in Software Engineering by Arnetminer. Dr. Abrahamsson received his Ph.D. in Software Engineering in 2002 and started his career as a software developer and a quality manager.