Readings
20 posts
Sustainable computational science: the ReScience initiative
Authors: Nicolas Rougier, Konrad Hinsen and others
Journal: PeerJ Computer Science
Publisher: PeerJ Inc.
Computer science offers a large set of tools for prototyping, writing, running, testing, validating, sharing and reproducing results; however, computational science lags behind. In the best case, authors may provide their source code as a compressed archive and they may feel confident their research is reproducible. But this is not exactly true.
Jonathan Buckheit and David Donoho proposed more than two decades ago that an article about computational results is advertising, not scholarship. The actual scholarship is the full software environment, code, and data that produced the result. This implies new workflows, in particular in peer review. Existing journals have been slow to adapt: source code is rarely requested and hardly ever actually executed to check that it produces the results advertised in the article.
ReScience is a peer-reviewed journal that targets computational research and encourages the explicit replication of already published research, promoting new and open-source implementations in order to ensure that the original research can be replicated from its description. To achieve this goal, the whole publishing chain is radically different from other traditional scientific journals. ReScience resides on GitHub where each new implementation of a computational study is made available together with comments, explanations, and software tests.
More: https://www.labri.fr/perso/nrougier/papers/10.7717.peerj-cs.142.pdf
Hardware Versus Software Fault Injection of Modern Undervolted SRAMs
Researchers from the Barcelona Supercomputing Center (Spain) and Abdullah Gul University in Kayseri (Turkey) share an approach that applies fault maps from real undervolted SRAMs to a simulated system in order to observe the resiliency of applications.
They compare the hardware-guided fault injection approach with a random fault injection approach. Significant differences appear in the coarse categorization of application resiliency, and they become more obvious as the number of faulty bits increases. There are also differences in the quality of the output between the two techniques. This is because in a realistic system not all fault locations have the same probability of presenting faults; from the software perspective, the faults can therefore propagate only to a limited number of software structures.
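To illustrate the difference between the two injection strategies, here is a minimal sketch (not the authors' actual framework; the fault-map format is a hypothetical simplification): random injection flips uniformly chosen bits, while hardware-guided injection flips each bit according to a measured per-bit fault probability.

```python
import random

WORD_BITS = 32

def random_injection(memory, n_faults):
    """Flip n_faults bits chosen uniformly across the whole memory."""
    for _ in range(n_faults):
        word = random.randrange(len(memory))
        bit = random.randrange(WORD_BITS)
        memory[word] ^= 1 << bit

def hardware_guided_injection(memory, fault_map):
    """Flip each bit independently with its measured fault probability.

    fault_map maps (word, bit) to a probability taken from real
    undervolted-SRAM characterization; most cells never fail while a
    few "weak" cells fail often, unlike the uniform random model.
    """
    for (word, bit), p in fault_map.items():
        if random.random() < p:
            memory[word] ^= 1 << bit

# Toy usage: a 256-word memory and a map with three weak cells.
memory = [0] * 256
fault_map = {(3, 7): 0.9, (42, 0): 0.5, (100, 31): 0.1}
hardware_guided_injection(memory, fault_map)
random_injection(memory, n_faults=3)
```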
Corrective Commit Probability Code Quality Metric
An article by Idan Amit and Dror G. Feitelson from the Department of Computer Science at the Hebrew University of Jerusalem presents a code quality metric, the Corrective Commit Probability (CCP).
This metric measures the probability that a commit reflects corrective maintenance. The authors argue that the metric agrees with developers’ concept of quality and that it is informative and stable. Corrective commits are identified by applying a linguistic model to commit messages. The team computed the CCP of all large active GitHub projects (7,557 projects with 200+ commits in 2019). This leads to a quality scale suggesting that the bottom 10% of projects by quality spend at least six times more effort on fixing bugs than the top 10%. Analysis of project attributes shows that lower CCP (higher quality) is associated with smaller files, lower coupling, use of languages like JavaScript and C# as opposed to PHP and C++, fewer developers, lower developer churn, better onboarding, and better productivity. Among other things, these results support the “Quality is Free” claim, and suggest that achieving higher quality need not require higher expenses.
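As a rough illustration of how such a metric can be computed (the authors use a trained linguistic model and correct for its hit rate; the keyword matcher below is a naive stand-in):

```python
import re

# Naive stand-in for the paper's linguistic model: a commit counts as
# corrective if its message contains bug-fixing vocabulary.
CORRECTIVE = re.compile(r"\b(fix(es|ed)?|bug|defect|fault|error)\b", re.IGNORECASE)

def corrective_commit_probability(messages):
    """Estimate CCP as the fraction of commits classified as corrective."""
    if not messages:
        return 0.0
    corrective = sum(1 for m in messages if CORRECTIVE.search(m))
    return corrective / len(messages)

log = [
    "Fix null pointer dereference in parser",
    "Add CSV export option",
    "Refactor configuration loader",
    "Bug fix: off-by-one in pagination",
]
print(corrective_commit_probability(log))  # 0.5
```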
MongoDB, A Database For Document Stores
A potential acquisition target for Oracle or Microsoft, MongoDB leads the document store market, and is now ranked #5 among all DBMS (source: DB-Engines). It is at the heart of the DECODER PKM and of many single-page applications built on the MEAN stack (MongoDB, Express, Angular, Node.js).
In a recent article, Eric Weiss, Analyst at several large banks, sees MongoDB as the clear-cut leader within the high-growth, non-relational database SaaS sector. "MongoDB has been and will continue to be an indirect beneficiary of high-growth megatrends such as AI, Machine Learning, IoT (Internet-of-the-Things) and digitalization. Each of these trends have sparked an exponential growth in supply of unstructured data resulting in an increasing demand for (NoSQL) non-relational database solutions. Such databases can much more efficiently handle this new flow of data workloads compared to more traditional relational, SQL-based solutions".
- Read the article MongoDB, A Database For The New Era
- When it's time to create your first MDB database
- Upgrading to release 4? Try this quiz on MongoDB 4 new features and database updates: MDB 4 quiz on TechTarget
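To make the document-store model concrete, here is a minimal sketch using the official PyMongo driver (the connection string and collection names are placeholders): records are schemaless JSON-like documents, so two entries with different fields can live side by side.

```python
from pymongo import MongoClient  # pip install pymongo

# Placeholder connection string; adjust for your deployment.
client = MongoClient("mongodb://localhost:27017")
collection = client["demo_db"]["devices"]

# No table schema to declare: each document carries its own structure.
collection.insert_one({"device": "A1", "temp": 21.5, "unit": "C"})
collection.insert_one({"device": "B2", "readings": [3, 4, 5], "tags": ["iot"]})

# Query by field value.
for doc in collection.find({"device": "A1"}):
    print(doc)
```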
Big Code has a direct impact on the business outcomes
For developers, code releases are "emotional" events. Many feel fear and anxiety at the moment they release code or submit it for review, worried about breaking dependencies.
Indeed, managing large and complex code bases (Big Code) can become laborious, time consuming and costly. Joe McKendrick's article refers to a 2020 survey of 500 North American professional developers conducted by Dimensional Research and underwritten by Sourcegraph. The Emergence of Big Code survey highlights a dramatic growth in the volume and complexity of software code.
It's almost unanimous: 99% of respondents report that big code has a direct impact on the business outcomes of software development efforts. Challenges include less time for new hires to be productive (62%), code breaking due to a lack of understanding of dependencies (57%), and difficulties managing changes to code (50%).
Read the full article in ZDnet: https://www.zdnet.com/article/low-and-no-code-are-wonderful-but-a-big-code-world-lurks-underneath/
Machine Learning for Cybersecurity
Automated Vulnerability Detection in Source Code Using Minimum Intermediate Representation
Vulnerabilities are among the root causes of network intrusion. An effective way to mitigate security threats is to discover and patch vulnerabilities before an attack. Traditional vulnerability detection methods rely on manual participation and incur a high false positive rate. Learning-based vulnerability detection methods, in turn, suffer from long-term dependency problems, out-of-vocabulary tokens, coarse detection granularity, and a lack of vulnerable samples.
This paper proposes an automated and intelligent vulnerability detection method for source code based on minimum intermediate representation learning.
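One common ingredient in this line of work, sketched below purely as an illustration (this is a generic identifier-symbolization step, not necessarily the paper's exact representation): user-defined names are mapped to canonical tokens, which shrinks the vocabulary a model has to learn and mitigates out-of-vocabulary issues.

```python
import re

# Keywords and library names to preserve; everything else is abstracted.
KEEP = {"if", "else", "for", "while", "return", "int", "char",
        "void", "sizeof", "strcpy", "malloc", "free"}

def symbolize(code: str) -> str:
    """Map user-defined identifiers to canonical tokens (VAR1, VAR2, ...)."""
    mapping = {}
    def rename(match):
        name = match.group(0)
        if name in KEEP:
            return name
        mapping.setdefault(name, f"VAR{len(mapping) + 1}")
        return mapping[name]
    return re.sub(r"[A-Za-z_]\w*", rename, code)

print(symbolize("char buf[8]; strcpy(buf, user_input);"))
# char VAR1[8]; strcpy(VAR1, VAR2);
```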
More..
MLOps Demystified
As Machine Learning at an organization matures from research to applied enterprise solutions, the need arises for automated Machine Learning operations that can efficiently handle the end-to-end ML lifecycle.
The goal of level 1 MLOps is to perform continuous training of the model by automating the entire machine learning pipeline, which in turn leads to continuous delivery of the prediction service. The underlying concept that empowers continuous model training is the ability to do data version control along with efficient tracking of training/evaluation events.
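A minimal sketch of the level-1 idea (file names, the hash-based versioning scheme, and the train/deploy stubs are all illustrative assumptions): each run versions the training data by content hash, logs the training event, and retrains only when the data has changed.

```python
import hashlib
import json
import time
from pathlib import Path

def data_version(path: str) -> str:
    """Version the dataset by content hash: a simple form of data version control."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()[:12]

def train(data_path):   # stand-in for the real training job
    return {"trained_on": data_path}

def deploy(model):      # stand-in for releasing the prediction service
    print("deployed", model)

def run_pipeline(data_path="train.csv", log_path="runs.json"):
    version = data_version(data_path)
    log = Path(log_path)
    runs = json.loads(log.read_text()) if log.exists() else []
    if runs and runs[-1]["data_version"] == version:
        return  # data unchanged since last run: skip retraining
    deploy(train(data_path))
    runs.append({"data_version": version, "timestamp": time.time()})
    log.write_text(json.dumps(runs, indent=2))
```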
- Read MLOps Demystified, by Shubham Saboo (7-8 min read)
DevOps Market to reach $15 billion by 2026
The global DevOps market size is projected to reach $14,969.6 million by 2026, a compound annual growth rate of 19.1%, according to a Fortune Business Insights report. The report highlighted the significance of this increase: the market is set to roughly quadruple (about +304%) over eight years, from only $3,708.1 million in 2018. Containerization, PaaS (Platform as a Service) and hybrid cloud are three major enablers of DevOps growth.
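Those figures are easy to sanity-check (a quick back-of-the-envelope computation, not from the report itself):

```python
start, end, years = 3708.1, 14969.6, 8  # $M in 2018 -> $M in 2026

growth = end / start              # ~4.04x overall, i.e. about +304%
cagr = growth ** (1 / years) - 1  # ~0.191, matching the reported 19.1%
print(f"{growth:.2f}x overall, {cagr:.1%} CAGR")  # 4.04x overall, 19.1% CAGR
```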
For more information:
- Read this DevOps market TechRepublic article
- Read another DevOps market study from Grand View Research, which expects $12.85 billion by 2025 with a similar CAGR of +18.6%, or this related Medium article
Agile Testing + DevOps = DevTestOps
"DevOps is now really DevTestOps and for teams to be truly agile, test management is the vital link in the success of DevOps. You require TestOps to match the pace of DevOps and testing early and often — breaking the silos."
In fact, the World Quality Report 2019-2020 led by Capgemini shows increased investment in the QA and Test function over the past four years, reported by 90% of US and 69% of Canadian survey participants.
The Global Embedded Systems Market Expected to Grow at a +5% CAGR Through 2024
An embedded system is a combination of software and hardware that together facilitate the accurate functioning of a target device. The embedded systems market is expected to grow significantly from 2019 to 2024, owing to increasing consumer spending on smartphones, demand for application-specific integrated circuits and high-speed operating systems, and ongoing technological advancement.
A recent Advance Market Analytics study segments the market by type, by application (Automotive, Telecommunication, Healthcare, Industrial, Consumer Electronics and Military & Aerospace) and by major geographies with country-level break-up. According to this study, the Global Embedded Systems market is expected to see a growth rate of 5.28% and may reach a market size of USD 536.2 million by 2024.