sample-audit

Sample system audit

Introduction

This audit explores the basic technical aspects of the YOURNAMEHERE.com software system, as well as the development history and practices to date. The focus is on evaluating:

Overall assessment

The system is a small-to-medium Rails system (~10K lines of code, ~80-90 model classes). The system provides significant functionality to customers and seems to be fairly effective at delivering value to the current customer base. Certain parts of the domain are complex enough, and implemented to a sufficient degree in the system that re-writing the application from scratch would be expensive compared to improving the quality of the system in-place, given use of appropriate maintenance techniques.

The practices in use in development thus far do not support the evolution of the system to support more complex features, higher scalability, ease of bugfix / defect repair, and evolving the domain model to support future customer needs.

Collaboration tooling is under-utilized. Automated testing, and test-based refactoring practices are missing. Tooling and services related to exception reporting, metrics and telemetry, log collection, profiling, etc., are missing and would be required to improve quality, scalability, stability, and observability as the customer-base grows.

There are a few hotspots in the system which require attention, but much of the system seems relatively stable. The domain and database model should receive heightened scrutiny and there are certainly areas which will need refactoring to come into alignment with the actual domain model which best serves the customer.

Hotspot models suffer from the collection of too much functionality (low cohesion) and too much dependence on other modules (high coupling). Many methods are complex and could benefit from extraction or similar refactoring. The Law of Demeter (see below) is widely violated by methods reaching into foreign classes to extract specific data from deep within model associations (i.e., joins across many database tables). These methods are brittle and suffer from increased likelihood of intermittent failures due to missing intermediate data.

That said, overall the system is in better shape than many we have reviewed, but maintenance needs to be targeted at hotspots with a critical eye coupling and database design. Tooling and practices are a cause for serious concern and need to be addressed in the immediate time frame.

Recommendations

High priority recommendations:

Medium priority recommendations:

Low priority recommendations:

Resources

A number of books and resources are available that speak directly to improving the quality of the system over time. Particularly relevant are the following:

A book about how project work is undertaken at Basecamp (formerly 37Signals), the company from which Ruby on Rails was born. Some deep insights about what does and doesn’t work regarding feature selection, estimation, project cycle planning, risk management, etc.

Covers the process of understanding the customer domain and translating it into effective models in our systems. Introduces “ubiquitous language”, “bounded contexts” and relationships between bounded contexts in a software system. How do we transition a running system to be in tune with the model which actually serves the customer domain?

Systematic and insightful exploration of how to change complex running systems through the use of characterization testing, supplementing code with tests, refactoring to tests, etc. Critical techniques for improving complex systems with little to no test coverage.

Even with the tools and techniques available from “Working Effectively with Legacy Code”, how do we determine which larger changes are safe, systematically? The Mikado Method is a straightforward approach to navigating the complexity of change.

How do we look at the history of a system and gain deep insights about where complexity lies, our risks about defects, code ownership, knowledge loss, churn, etc.? The related CodeScene system makes these tools available for easy use, and for ongoing insights into what is or is not working when improving a system.

How do we use cryptographic techniques to store and utilize sensitive and private information inside our databases? Varying techniques for different levels of sensitivity and utility.

The model layer of the application suffers heavily from violations of the Law of Demeter. Cohesion can be increased and coupling decreased (and hence developer effort, churn, and defect rates are all positively affected) by adhering more closely to this principle.

Database overview

Entity-relationship diagram (ERD)

To install and run the rails-erd tool (on OSX):

brew install graphviz
bundle install
bundle exec erd

The .erdconfig configuration file used was:

orientation: vertical
indirect: true
inheritance: true

This generates the erd.pdf file linked here.

Discussion

It is typical for a customer-facing application to have models (persisted in database tables) which acquire complexity, size, and connections to other models in the system. Frequently, a person-related model, a core domain-related model (what is this application managing?), and some contextual models are involved (for instance, organization or location). This application is no exception to that trend. The primary models, reflected by tables in the database are the ModelA, ModelB and (survey) ModelC models.

These models are complex in their relationships to other parts of the system. They appear to also be harboring multiple concerns in-model which would ideally be represented separately. It is likely that hotspot/churn/refactoring analysis later will find that these parts of the system are the locus of bugs and increased maintenance work.

ModelA account information (for authenticating and controlling access to parts of the system) is embedded in the ModelB model, as well as being part of the ModelA model. There may be a missing abstraction that could be introduced to simplify the duplicate and muddy responsibilities with regard to people and their interactions with the system.

The concept of “archiving” information from season to season was important in discussions of the problem space, but only a couple of models (ModelD and ModelE) seem to actually record this information. The ModelF model seems to be the locus of determining if information is part of the “current” season, via its relationship. This is probably difficult to manage within the system’s logic, and there doesn’t immediately appear to be a way to determine seasonality for non-current information.

The notion of “active” (occuring in 8 places in the database) may be handling some of the weight of current/archived behavior, as well as “for now this option is manually disabled”. It would be desirable to have a well-defined meaning for active/inactive/current/archived and similar time-varying concepts in the system, that are known and enforced within the system.

The null: false idiom appears in a number of places throughout the database, but not in all places where it might be expected. It seems likely that defaulting and protecting against null values is probably inconsistently handled throughout the models. This could result in buggy and inconsistent behavior in various places in the system.

Guidance:

Rails stats

Statistics collected via rails stats at audit time:

bin/rails stats

+----------------------+--------+--------+---------+---------+-----+-------+
| Name                 |  Lines |    LOC | Classes | Methods | M/C | LOC/M |
+----------------------+--------+--------+---------+---------+-----+-------+
| Controllers          |   5325 |   4309 |     159 |     633 |   3 |     4 |
| Helpers              |    132 |    105 |       0 |      17 |   0 |     4 |
| Jobs                 |      2 |      2 |       1 |       0 |   0 |     0 |
| Models               |   6086 |   4115 |      82 |     496 |   6 |     6 |
| Mailers              |    123 |     93 |       2 |      16 |   8 |     3 |
| Channels             |     24 |     21 |       2 |       2 |   1 |     8 |
| JavaScripts          |   1804 |   1401 |       0 |     165 |   0 |     6 |
| Libraries            |    343 |    288 |       0 |       2 |   0 |   142 |
| Model specs          |    202 |    155 |       0 |       0 |   0 |     0 |
| Controller specs     |    134 |    102 |       0 |       0 |   0 |     0 |
+----------------------+--------+--------+---------+---------+-----+-------+
| Total                |  14175 |  10591 |     246 |    1331 |   5 |     5 |
+----------------------+--------+--------+---------+---------+-----+-------+
  Code LOC: 10334     Test LOC: 257     Code to Test Ratio: 1:0.0

Testing

As rails stats from above indicates, there are only 257 lines of testing code in the application. Very few actual tests exist. The suite does not run without error:

An error occurred while loading ./spec/models/abcdef.rb.
Failure/Error:
  RSpec.describe ModelG, type: :model do
    pending "add some examples to (or delete) #{__FILE__}"
  end

NameError:
  uninitialized constant ModelG
# ./spec/models/abcdef.rb:3:in `<top (required)>'

An error occurred while loading ./spec/models/abcdef.rb.
Failure/Error:
  RSpec.describe ModelF, type: :model do
    pending "add some examples to (or delete) #{__FILE__}"
  end

NameError:
  uninitialized constant ModelF
# ./spec/models/abcdef.rb:3:in `<top (required)>'
DEPRECATION WARNING: Passing string to be evaluated in :if and :unless conditional options is deprecated and will be removed in Rails 5.2 without replacement. Pass a symbol for an instance method, or a lambda, proc or block, instead. (called from <class:ModelH> at /ModelAs/rick/git/lronhodl/yournamehere/your-name-here-yournamehere/app/models/abcdef.rb:23)


Finished in 0.32156 seconds (files took 8.46 seconds to load)
0 examples, 0 failures, 2 errors occurred outside of examples

Conclusions: automated testing is not being utilized on this project.

Many modern techniques for safe bugfixing, feature improvement, and refactoring on production systems rely on the ability to utilize an automated test suite to gain confidence, avoid (re-)introduction of bugs, and not result in outages.

Guidance:

External services / components

Note: Twilio environment/credential reference is spread throughout various application files:

./app/models/abcdef.rb:    twilio_api_sid = ENV['TWILIO_SID']
./app/models/abcdef.rb:		twilio_token = ENV['TWILIO_AUTH_TOKEN']
./app/models/abcdef.rb:		twilio_sid = ENV['TWILIO_SID']
./app/models/abcdef.rb:		twilio_token = ENV['TWILIO_AUTH_TOKEN']
./app/services/abcdef.rb:		twilio_sid = ENV['TWILIO_SID']
./app/services/abcdef.rb:		twilio_token = ENV['TWILIO_AUTH_TOKEN']
./app/services/abcdef.rb:    twilio_phone_number = ENV['TWILIO_NUMBER']
./app/services/abcdef.rb:  		twilio_sid = ENV['TWILIO_SID']
./app/services/abcdef.rb:  		twilio_token = ENV['TWILIO_AUTH_TOKEN']
./app/services/abcdef.rb:    twilio_sid   = ENV['TWILIO_SID']
./app/services/abcdef.rb:    twilio_token = ENV['TWILIO_AUTH_TOKEN']
./app/services/abcdef.rb:		twilio_sid   = ENV['TWILIO_SID']
./app/services/abcdef.rb:    twilio_token = ENV['TWILIO_AUTH_TOKEN']
./app/services/abcdef.rb:		twilio_sid = ENV['TWILIO_SID']
./app/services/abcdef.rb:		twilio_token = ENV['TWILIO_AUTH_TOKEN']
./app/services/abcdef.rb:    twilio_phone_number = ENV['TWILIO_NUMBER']
./app/services/abcdef.rb:		twilio_sid = ENV['TWILIO_SID']
./app/services/abcdef.rb:		twilio_token = ENV['TWILIO_AUTH_TOKEN']
./app/services/abcdef.rb:		twilio_sid = ENV['TWILIO_CONSENT_SID']
./app/services/abcdef.rb:		twilio_token = ENV['TWILIO_CONSENT_AUTH_TOKEN']
./app/services/abcdef.rb:    twilio_phone_number_default = ENV['TWILIO_NUMBER']
./app/services/abcdef.rb:    twilio_sid = ENV['TWILIO_SID']
./app/services/abcdef.rb:    twilio_token = ENV['TWILIO_AUTH_TOKEN']
./app/services/abcdef.rb:    twilio_sid   = ENV['TWILIO_SID']
./app/services/abcdef.rb:    twilio_token = ENV['TWILIO_AUTH_TOKEN']

There is almost certainly a missing abstraction which could encapsulate and mediate access to Twilio for the rest of the system.

Note: A commendable effort has been taken to ensure that credentials (passwords / secrets) are not stored in the codebase. Using Heroku as a service provider makes sticking to this practice easier, but not fool-proof. One of the common pitfalls for web applications (credentials management) has been avoided here.

Missing services

The following service capabilities would be highly useful for maintaining and monitoring the system but do not appear to be present:

Guidance:


Code analysis

This analysis was undertaken using the processes and tools outlined in the following books:

This audit used CodeScene to automate the analyses of the historical activity on this project. The full codescene analysis is available online, and access should be straightforward to set up.

Organizational

There are 10 contributors to the project since its inception. Three of those contributors are responsible for the bulk of the code in the system.

Over time, the amount of work contributed, as measured against the number of developers actively working on the project looks like this (see CodeScene documentation for more background on this graph):

This seems to tell us that the project has a smaller set of core contributors now than at its outset, and at least, in a pure volume sense, there is as much work happening now as ever. A different sort of analysis can tell us whether developers are churning, or whether it’s easy to make progress.

What does productivity look like, with more detail, over the most recent month (available at github pulse)

Work is happening, lots of issues are closing, some opening. It’s not easy to see from this view whether that means problems are being solved, or issues are being closed for other reasons.

Observations: Pull requests are not being utilized effectively. GitHub tooling, in general, is not being effectively utilized:

Guidance:

All of these items are HIGH priority:

Code authorship

(Source) Zooming into the “models” portion of the application (where much of the domain knowledge resides in the system), three of the most connected models identified in the database review are highlighted here. Two of these models are the largest models in the system, likely implying (expected) complexity in those files. Note that two developers are responsible for being primary authors of the bulk of the model code. The variations in primary author over various files imply that coordination is necessary for changes to these core models.

“Hotspots”

Hotspots are modules in the system where complexity is high and also the rate of change is high. This has implications for the number of bugs (they tend to be high), as well as the difficulty of changing the code in these areas, and the amount of coordination that needs to happen between developers to do work there.

(Source) Here is the 10,000-foot overview of hotspots in the system. Some of the files which are hotspots are parts of the project infrastructure (the schema file which tracks all changes to the database design, the routes file which maps every URL on the app to its related code, and so on). The rest of the files (highlighted in darker shades of red) are areas where we expect to see developers churning to try to fight bugs and reliably add features.

Zooming in to the model area, let’s see which models are causing headaches:

This comports with our earlier suspicions, with ModelA and ModelB difficult areas in which to make changes.

Another area where hotspots are showing up is the “services” portion of the application. In particular the “reply processor” and the “SMS processor” are both worth investigating more deeply and seeing if there are ways to refactor and extract responsibilities from this code:

But does this matter?

The ability to use the patterns of development to identify hotspots, and to use complexity information (how complex is this code), as well as metadata around issues being addressed, allows the computation of defect impact contribution from hotspots. A simple summary:

So, 6% of the code in the system is garnering 22% of developer effort, and can be estimated to be responsible for almost a third of all bugfix effort.

Refactoring targets

CodeScene identifies a set of refactoring targets to help guide where cleanup effort is most likely to positively impact defects and developer velocity. ([Interactive Source])(https://codescene.io/projects/000/jobs/000/results/code/hotspots/system-map) and Text list)

Ignoring two false positives (routes.rb and schema.rb, which are necessarily involved in many changes in Rails applications) the prioritized refactoring targets list reads as:

The two files noted at the top of the list as high priority are the ModelB and ModelA models which we’ve suspected previously as being implicated in defects and difficulty in changing the system. A deeper dive is available via the “X-Ray” functionality in CodeScene, which points to specific methods in these files which are complex, and coupled to other methods in the same file and others – and hence which are difficult to maintain and require the need for coordination to avoid introducing defects.

ModelB and ModelA “X-ray”

Looking more deeply into complexity and churn within these hotspot modules, we see that there are a number of problematic methods (likely involved in defects and good targets for refactoring-under-tests):

In the ModelB model (Source):

In the ModelA model (Source):

Choosing a couple of selected methods for a closer look:

ModelB::happiness_popups

#### code would appear here!

Here we see fairly high complexity, and this method has been frequently changed. References to specific data in other models could be de-coupled by giving appropriate names and moving the accesses to tested methods on the interface of the other models. There is likely a missing service class around the various Foo-related calculations.

ModelA – date calculations

#### more code would appear here!

In the ModelA model there is a lot of similar but varied calculations around dates, scheduling, etc. This speaks to an opportunity to refactor related code, and most likely to introduce service classes that mediate how the system works with dates and timings.

In many places we see violations of the Law of Demeter and of model encapsulation, where chained methods are reaching deeply into multiple models to ferret out data. For example:

ModelA coupling

#### More code!

Digging through the results of hotspot X-ray analysis is useful in identifying areas in the system where targeted refactorings and cleanups can happen, especially when the next opportunity to revisit that code for a feature or bugfix comes up.

Miscellaney

A few other notes are in order from spot-checking parts of the system looking for odd code:

Low-level system considerations

Installing the application locally

Note: I am using rbenv on my local (Mac OSX) system.

brew install postgresql   # possibly optional
pg_ctl -D /usr/local/var/postgres -l /usr/local/var/postgres/server.log start   # possibly optional
rbenv local 2.5.1
gem install -v 1.16.6 bundler
bin/bundle install

Running the application locally

Note: Instructions in README.md at repository root give incorrect database configuration guidance.

Specifies:

$ cp config/database.postgresql.yml config.yml

instead of:

$ cp config/database.postgresql.yml config/database.yml

To start the application:

rails db:create db:schema:load db:seed
rails server

Warning: The local development instance has a hard dependency on the version of NodeJS available on the host. If running an out-of-date version of note, accessing the server at http://localhost:3000/ will return an error page containing a message like:

RuntimeError in Sessions#new_ModelA_session
Autoprefixer doesn’t support Node vXX.XX.XX. Update it.

Updating NodeJS version (OSX):

sudo npm cache clean -f
sudo npm install -g n
sudo n stable

Then the application may be started successfully and accessed via http://localhost:3000/.

Note: It is unclear which credentials can be used to log in to the local development environment.

Note: In brief navigation I quickly encountered error pages (thrown exceptions) navigating in the development environment. For example:

Accessing: http://localhost:3000/some_wizard/phone_number to start a survey, the following exception was thrown:

NoMethodError in SomeWizard#show
undefined method `message' for nil:NilClass

The error is occurring in this line:

              <%= Translation.translate_thingie((@thingie.language || 'English'), @that_thing.message(@ModelB)) %>

Which leads me to believe that the translation API may not be configured properly on initial setup.

Guidance:

Software version notes and upgrade guidance

Ruby interpreter

Ruby version (as specified in Gemfile): 2.5.1 – security vulnerabilities, upgrades required.

This ruby version has been superceded in the 2.5 series by:

Guidance:

gem install rdoc -f

Ruby on Rails

Rails version: 5.1.7 (via ~5.1.2)

This is the most recent Rails version of the 5.1 series, and as such includes all relevant security fixes and patches. There are security patches in the 5.1 line, but heroku should have deployed the 5.1.7 version for the app some time ago.

There are later versions of Rails: the 5.2.* series, and the 6.0.* series. Upgrades have gotten easier over the years, but there is no reason to rush an upgrade at this point.

Note that README.md references an outdated version of Rails, and should be updated.

Guidance: Upgrade README.md language to reference Rails version 5.1.7 instead of 5.1.3. Priority: Low

Ruby gem libraries

Known vulnerabilities

Running a vulnerability check via bundler-audit:

gem install bundle-audit
bundle audit check --update

Reported vulnerabilities:

Insecure Source URI found: http://rubygems.org/
Name: nokogiri
Version: 1.10.2
Advisory: CVE-2019-5477
Criticality: High
URL: https://github.com/sparklemotion/nokogiri/issues/1915
Title: Nokogiri Command Injection Vulnerability via Nokogiri::CSS::Tokenizer#load_file
Solution: upgrade to >= 1.10.4

Name: nokogiri
Version: 1.10.2
Advisory: CVE-2019-11068
Criticality: Unknown
URL: https://github.com/sparklemotion/nokogiri/issues/1892
Title: Nokogiri gem, via libxslt, is affected by improper access control vulnerability
Solution: upgrade to >= 1.10.3

Name: rubyzip
Version: 1.1.7
Advisory: CVE-2017-5946
Criticality: High
URL: https://github.com/rubyzip/rubyzip/issues/315
Title: Directory traversal vulnerability in rubyzip
Solution: upgrade to >= 1.2.1

Name: rubyzip
Version: 1.1.7
Advisory: CVE-2018-1000544
Criticality: Unknown
URL: https://github.com/rubyzip/rubyzip/issues/369
Title: Directory Traversal in rubyzip
Solution: upgrade to >= 1.2.2

Vulnerabilities found!

Guidance:

Warnings

A warning is emitted on bundle install:

Ruby Sass has reached end-of-life and should no longer be used.

* If you use Sass as a command-line tool, we recommend using Dart Sass, the new
  primary implementation: https://sass-lang.com/install

* If you use Sass as a plug-in for a Ruby web framework, we recommend using the
  sassc gem: https://github.com/sass/sassc-ruby#readme

* For more details, please refer to the Sass blog:
  https://sass-lang.com/blog/posts/7828841

The ruby-sass gem is dead. This blog post gives guidance on moving away from ruby-sass.

Guidance: Consider moving away from this gem. Unless Rails version is upgraded this is not likely to immediately cause issues. Priority: Low

Git-pinned gem resources

Gems specified in Gemfile which are being pulled from GitHub revisions instead of upstream Gem servers):