Design and Implementation of compleX Information Systems

Write

Technology ruby on rails optimization
[RAILS] optimizing a slow request

We identified a single slow request in our moderation module: retrieving the JSON containing the entries to be moderated, which feeds our React app.

In short, our data model looks as follows:

class Topic < ApplicationRecord
  has_many :entries
  has_many :dynamic_attributes
end

class Entry < ApplicationRecord
  belongs_to :topic
  has_many :entry_values
end

class EntryValue < ApplicationRecord
  belongs_to :entry
  belongs_to :dynamic_attribute
end

So topics have a set of (dynamic) attributes that can be entered. An entry_value is the value entered for a dynamic_attribute, and those values are grouped into a complete entry.

In the moderation, our moderators verify that an entry (a collection of entry-values) is appropriate, and they have the option to edit it or add missing information.

So in our controller we do something like

@entries = @q.result(distinct: true).page params[:page]

We use ransack to filter and search our entries. When building the JSON we need the entry-values, and possibly their dynamic attributes, so what is the best approach here: eager_load or includes? What better way to decide than to actually test it? So I temporarily changed the controller code as follows:

        preload_option = params[:pre].try(:to_i)
        if preload_option == 0
          @entries = @q.result(distinct: true).page params[:page]
        elsif preload_option == 1
          @entries = @q.result(distinct: true).eager_load(:entry_values).page params[:page]
        else
          @entries = @q.result(distinct: true).includes(:entry_values => :dynamic_attribute).page params[:page]
        end

Note:

  • in the eager_load variant I only load one level deep (:entry_values), although eager_load also accepts nested associations
  • with includes I immediately fetch all dynamic attributes for the entry-values as well.

But which will prove to be more beneficial?

So then I ran a small benchmark script:

require 'benchmark'
require 'rest-client'

n = 10
Benchmark.bm do |x|
  x.report("normal ") do
    n.times { RestClient.get("http://admin.lvh.me:3013/moderation/projects/11/entries.json", {}) }
  end
  x.report("eager  ") do
    n.times { RestClient.get("http://admin.lvh.me:3013/moderation/projects/11/entries.json?pre=1", {}) }
  end
  x.report("include") do
    n.times { RestClient.get("http://admin.lvh.me:3013/moderation/projects/11/entries.json?pre=2", {}) }
  end
end

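(Note: for testing purposes I also disabled authentication, so I could easily fetch the JSON responses and compare the timings.)

This first run gave me the following results (columns: user, system and total CPU time of the benchmark script itself and, in parentheses, the elapsed real time in seconds for all 10 requests):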
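Because Benchmark.bm only reports the total over all ten requests, a single slow outlier can skew the comparison. Below is a minimal sketch of a per-request timing helper. The `time_runs` and `summarize` methods are hypothetical (they are not part of my original script), and the `sleep` is only a stand-in for the `RestClient.get` call:

```ruby
require 'benchmark'

# Hypothetical helper (not from the original script): runs the block
# `n` times and returns the wall-clock duration of every single run.
def time_runs(n)
  Array.new(n) { Benchmark.realtime { yield } }
end

# Summarize the timings so outliers become visible.
def summarize(timings)
  sorted = timings.sort
  {
    min: sorted.first,
    median: sorted[sorted.size / 2],
    max: sorted.last,
    total: timings.sum
  }
end

# Stand-in workload; in the real script this would be the RestClient.get call.
timings = time_runs(5) { sleep 0.01 }
stats = summarize(timings)
puts format('min %.3fs  median %.3fs  max %.3fs  total %.3fs',
            stats[:min], stats[:median], stats[:max], stats[:total])
```

If min and max lie far apart, it is probably worth running more iterations before drawing conclusions.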

    normal   0.016362   0.005723   0.022085 ( 35.440171)
    eager    0.010036   0.004336   0.014372 ( 28.632490)
    include  0.012550   0.004061   0.016611 ( 29.173778) 

Ok. Not the kind of improvement I had hoped for. It is also nice to notice that eager_load is slightly more efficient than includes in this case (28.6s versus 29.2s), which seemed a little counter-intuitive.

I had recently changed a small part of the code: in the moderation we also wanted to be able to edit fields that had not been entered, whereas before we only had to retrieve the entered values (:entry_values). So I suspected that this is where I had messed up the performance. Before the change we called entry.valid_entry_values, which looked like

  def valid_entry_values
    entry_values.sorted.select do |ev|
      da = ev.dynamic_attribute
      da.attribute_type != 'item' || (da.attribute_type == 'item' && !ev.item_content_type.nil?)
    end
  end

and I replaced it with the following, adding empty entry-values to be filled in:


  def entry_values_with_empty
    result = []
    self.topic.dynamic_attributes.each do |da|
      ee = entry_values.find_by(dynamic_attribute_id: da.id)
      if ee.nil? || (da.attribute_type == 'item' && ee.item_content_type.nil?)
        ee = entry_values.build(dynamic_attribute: da)
      end
      result << ee
    end
    # check if we have entry-values not yet in the list
    # (e.g. from another topic when the entry was moved, and add those too)
    self.entry_values.each do |entry_value|
      if result.select{|ee| ee.id == entry_value.id}.count == 0
        result << entry_value
      end
    end
   
    result
  end

So what happens if we switch back to the old valid_entry_values? How does that change performance?

I ran my small benchmark script again, and got the following results:

    normal   0.015264   0.005280   0.020544 ( 33.283069)
    eager    0.009901   0.004359   0.014260 ( 17.145350)
    include  0.013153   0.004032   0.017185 ( 17.621856)

Wow! Now eager_load and includes really seem to pay off: both roughly halve the total time (from 33.3s to about 17s), and they perform almost identically. Ok.

So if we check entry_values_with_empty more closely, the implementation is somewhat naive: for each dynamic attribute it tries to find the corresponding entry-value, except ... it fires a query every time, for each dynamic attribute, for each entry ... Mmmmmm. Let's see if we can improve this:

  def entry_values_with_empty
    result = []
    self.topic.dynamic_attributes.each do |da|
      ee = entry_values.detect{|ev| ev.dynamic_attribute_id == da.id}
      if ee.nil? || (da.attribute_type == 'item' && ee.item_content_type.nil?)
        ee = entry_values.build(dynamic_attribute: da)
      end
      result << ee
    end
    # check if we have entry-values not yet in the list
    # (e.g. from another topic when the entry was moved, and add those too)
    self.entry_values.each do |entry_value|
      if result.select{|ee| ee.id == entry_value.id}.count == 0
        result << entry_value
      end
    end

    result
  end

Notice: we only changed one line, replacing the find_by with a detect. This will, instead of launching a new query, iterate over the already retrieved array of entry_values. But does this make any difference?
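To make the difference concrete without a database, here is a minimal plain-Ruby sketch. `FakeTable` and its `find_by` are hypothetical stand-ins for the entry_values association (this is not ActiveRecord), and the counter simply records how often a "query" would be fired. The hash lookup in the third variant is an assumption of mine about a possible next step, not something I measured:

```ruby
# Hypothetical stand-in for an association: every find_by counts as one query.
EntryValue = Struct.new(:id, :dynamic_attribute_id)

class FakeTable
  attr_reader :query_count

  def initialize(rows)
    @rows = rows
    @query_count = 0
  end

  # Mimics a database lookup: one "query" per call.
  def find_by(dynamic_attribute_id:)
    @query_count += 1
    @rows.find { |row| row.dynamic_attribute_id == dynamic_attribute_id }
  end

  # Mimics loading the whole association once.
  def to_a
    @query_count += 1
    @rows.dup
  end
end

attribute_ids = (1..20).to_a
rows = attribute_ids.first(15).map { |id| EntryValue.new(id * 100, id) }

# Variant 1: one lookup query per dynamic attribute.
table = FakeTable.new(rows)
by_query = attribute_ids.map { |id| table.find_by(dynamic_attribute_id: id) }
find_by_queries = table.query_count
puts "find_by: #{find_by_queries} queries"

# Variant 2: load once, then search the array in memory with detect.
table = FakeTable.new(rows)
loaded = table.to_a
by_detect = attribute_ids.map { |id| loaded.detect { |ev| ev.dynamic_attribute_id == id } }
detect_queries = table.query_count
puts "detect:  #{detect_queries} query"

# Variant 3 (assumption, not benchmarked): build a hash once, so the array
# is not rescanned for every attribute.
lookup = loaded.each_with_object({}) { |ev, h| h[ev.dynamic_attribute_id] = ev }
by_hash = attribute_ids.map { |id| lookup[id] }
```

All three variants return the same entry-values (or nil for a missing one); only the number of simulated queries differs.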

Launching my small test script now returns the following:

    normal   0.016051   0.005418   0.021469 ( 16.448649)
    eager    0.009454   0.003858   0.013312 ( 22.479142)
    include  0.012419   0.003872   0.016291 ( 14.236868)

NICE! **fireworks** Not what I expected to see at all. I am a little baffled that the normal case improved that much, and that eager_load does not help here (on the contrary, it is slower). We have now found our optimal combination: improving entry_values_with_empty and adding the includes gives the best performance.

Is this what you would have expected? The bottom line remains: it helps to measure (in Dutch we say "meten is weten", which rhymes and means "to measure is to know").

Technology ruby on rails rspec
[RSPEC] Cleaning up orphaned attachments when running specs

When running the specs we create a lot of fake attachments, but they are never cleaned up. That is probably obvious, because we never actually destroy the models (containing the attachments): we truncate the database or roll back the transactions.

So I tried to find a way to 1) clean up those dummy attachments automatically, or at least more easily, and 2) make sure it works when using parallel:specs. It also had to work across my different projects, some of which use different gems to manage attachments.

In one project I am using paperclip, and there I took the following approach. In the initializer config/initializers/paperclip.rb I wrote

  Paperclip::UriAdapter.register
  if Rails.env.production?
    Paperclip::Attachment.default_options.merge!(
      hash_secret: ENV.fetch('SECRET_KEY_BASE'),
      s3_protocol: :https,
      url: ':s3_domain_url',
      path: "/:class/:attachment/:id/:style/:hash.:extension",
      storage: :s3,
      s3_credentials: { .. }
    )
  elsif Rails.env.development?
    Paperclip::Attachment.default_options.merge!(
      url: "/system/:class/:attachment/:id/:style/:hash.:extension",
      hash_secret: Rails.application.credentials.secret_key_base
    )
  elsif Rails.env.test? || Rails.env.cucumber?
    Paperclip::Attachment.default_options.merge!(
      url: "/spec_#{ENV['TEST_ENV_NUMBER']}/:class/:attachment/:id/:style/:hash.:extension",
      hash_secret: Rails.application.credentials.secret_key_base
    )
  end

and then in rspec rails_helper.rb I can add the following piece of code

  config.after(:suite) do
    FileUtils.remove_dir(File.join(Rails.root, 'public', "spec_#{ENV['TEST_ENV_NUMBER']}"), true)
  end

In another project I am using CarrierWave, and there it is a little more difficult, but it amounts to the same approach. In CarrierWave we create different uploaders, and each has its own configuration. In my project I first load all uploaders in my own code-base and explicitly require one uploader from our own gem (shared between different projects), and then override the paths for every uploader class. So we add an initializer config/initializers/carrierwave_clean_spec_attachments.rb (or whatever name you prefer) to override the paths when in test mode:

if Rails.env.test? || Rails.env.cucumber?
  Dir["#{Rails.root}/app/uploaders/*.rb"].each { |file| require file }
  require 'document_uploader'

  CarrierWave::Uploader::Base.descendants.each do |klass|
    next if klass.anonymous?
    klass.class_eval do
      def cache_dir
        "#{Rails.root}/spec/support/uploads_#{ENV['TEST_ENV_NUMBER']}/tmp"
      end

      def store_dir
        "#{Rails.root}/spec/support/uploads_#{ENV['TEST_ENV_NUMBER']}/#{model.class.to_s.underscore}/#{mounted_as}/#{model.id}"
      end
    end
  end
end

and then in my rails_helper.rb I can add the following statement:

config.after(:suite) do
  FileUtils.rm_rf(Dir["#{Rails.root}/spec/support/uploads_#{ENV['TEST_ENV_NUMBER']}"])
end
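Both setups rely on the same idea: when the specs run in parallel, every process gets its own TEST_ENV_NUMBER, so each process writes to, and afterwards removes, only its own folder. A minimal plain-Ruby sketch of that idea, using a temporary directory instead of Rails.root; the `uploads_dir` helper is hypothetical:

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: the upload folder for one test process.
def uploads_dir(root, env_number)
  File.join(root, "uploads_#{env_number}")
end

remaining = Dir.mktmpdir do |root|
  # Simulate three parallel processes; the first one typically has an
  # empty TEST_ENV_NUMBER, the others '2', '3', ...
  ['', '2', '3'].each do |env_number|
    dir = uploads_dir(root, env_number)
    FileUtils.mkdir_p(File.join(dir, 'tmp'))
    File.write(File.join(dir, 'tmp', 'dummy.txt'), 'fake attachment')
  end

  # What the after(:suite) hook of process 2 would do: remove its own folder only.
  FileUtils.rm_rf(uploads_dir(root, '2'))

  # The folders of the other processes are left untouched.
  Dir.children(root).sort
end

puts remaining.inspect
```

Because the folders never overlap, one process finishing early cannot delete attachments another process still needs.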

How do you do this? Do you use another gem for storage/attachments and how do you solve it? E.g. when using ActiveStorage ?

N/A
[rgeo] unable to convert multilinestring with zm geometry to geojson

I encountered a very specific issue when trying to convert the geometries read from shapefiles to geojson, and for some reason this failed.

I had some very simple code to inspect the shapefiles and see what I should do with them. In that process I also wanted to convert the geometries to geojson (for test), as that is the actual goal for me.

So my code iterates over a folder of shape-files, and then just does an inspection of the first element. This should give me an idea of what all the files contain. The code looked like this

Dir[File.join(root_folder, '*.shp')].sort.each do |shapefile|
  puts shapefile

  RGeo::Shapefile::Reader.open(shapefile) do |file|
    puts "File contains #{file.num_records} records."
    record = file.next
    puts "First record geometry WKT  : #{record.geometry.as_text}"
    puts "             coordinates   : #{record.geometry.coordinates}"
    puts "             geometry JSON : #{RGeo::GeoJSON.encode(record.geometry)}"
    puts "             Attributes    : #{record.attributes.inspect}"
    puts 
  end
end

I got the weirdest error when trying to run this code

 gems/rgeo-0.6.0/lib/rgeo/geos/zm_feature_methods.rb:305:in `block in each': no block given (yield) (LocalJumpError)

Apparently one of the shapefiles contained a MultiLineString with z and m coordinates. All zero, so whyyyyy? But still: rgeo should be able to handle that?

I traced the code in rgeo and found the following culprit in RGeo::Geos::ZMMultiLineStringMethods

 each.map(&:coordinates)

Ooooops: each is called without a block there, and this implementation of each yields unconditionally instead of returning an enumerator. Now how could I fix this? I am currently working on an ancient version of rails, and thus also of rgeo. I could open an issue to get it fixed, but I still would not be able to update my own version (mostly because of the activerecord-postgis-adapter).

But, thankfully, we are using ruby and we can hotfix code (reopen the class and fix the bug!). So I added an initializer in config/initializers/fix_rgeo_bug.rb with the following code

module RGeo
  module Geos
    module ZMMultiLineStringMethods # :nodoc:

      # overwrite to fix: always pass a block to each
      def coordinates
        coords = []
        each do |gm|
          coords << gm.coordinates
        end
        coords
      end

    end
  end
end

... and now my code is running smoothly!!
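The failure is easy to reproduce in plain Ruby. The sketch below uses a hypothetical `Lines` module and `FakeMultiLine` class (this is not RGeo code): an each that always yields breaks each.map, and reopening the module to redefine coordinates with an explicit block works around it, just like the initializer does.

```ruby
# Hypothetical stand-in for the buggy module: `each` yields unconditionally,
# so calling it without a block raises LocalJumpError.
module Lines
  def each
    @parts.each { |part| yield part }
  end

  def coordinates
    each.map(&:coordinates) # `each` is called without a block here
  end
end

Part = Struct.new(:coordinates)

class FakeMultiLine
  include Lines

  def initialize(parts)
    @parts = parts
  end
end

geometry = FakeMultiLine.new([Part.new([[0, 0], [1, 1]]), Part.new([[2, 2], [3, 3]])])

# Before the fix: capture the error instead of crashing.
error = begin
  geometry.coordinates
  nil
rescue LocalJumpError => e
  e
end
puts "before the fix: #{error.class}"

# Reopen the module and override the method, always passing a block to `each`.
module Lines
  def coordinates
    coords = []
    each { |part| coords << part.coordinates }
    coords
  end
end

puts "after the fix: #{geometry.coordinates.inspect}"
```

Note that the existing `geometry` object picks up the new method immediately, because the module is changed in place.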

It is so awesome that ruby lets us reopen classes, and that rails has a well-controlled loading system with an entry-point before execution where I can place my own initializers. I have used this a few times before, mainly to fix outdated gems without having to update them, or to add very specific behaviour. It is the combination of having readable code, open source code, and then re-opening classes to add my own behaviour or fix bugs. I am still thankful/happy every day to be working in ruby on rails :clap: :clap: :clap:

I wonder if there are other programming languages or frameworks where this is possible?

Now open an issue on rgeo to address this bug :)

News
Unable to load the EventMachine C extension; To use the pure-ruby reactor, require 'em/pure_ruby'

Unfortunately we have to deploy our rails projects on servers managed by our clients, and that means windows servers. Luckily this is no longer a big deal, but I develop on mac and mostly deploy on linux machines (which align), so a new deployment on windows almost always brings some surprises. We deploy using ruby 2.4, somewhere in our Gemfile we use eventmachine, and on the most recent deployment I suddenly got this weird error:

Unable to load the EventMachine C extension; To use the pure-ruby reactor, require 'em/pure_ruby'

Not sure what they mean here: do I need to adapt the gem code? Luckily some googling quickly turned up a solution. Apparently the eventmachine gem has not been updated correctly for ruby 2.4 or 2.5, and the proposed solution is to do

gem uninstall eventmachine  
gem install eventmachine --platform=ruby

instead. This sounds great. In theory. But in practice? I use bundler with a Gemfile, so after every deploy/bundle I would have to uninstall the eventmachine-1.2.7-x64-mswin32 gem again. I do have a script that I run on windows to deploy, so I could easily add

gem uninstall -aIx eventmachine 
gem install eventmachine --platform=ruby

(the -aIx will remove all eventmachine instances and not care about dependencies)
but this feels a little counter-productive (wrong?) and it did not always seem to work reliably.

So I looked for a way to describe in my Gemfile that the gem should be installed with the correct platform. Unfortunately platform has a different meaning inside a Gemfile, where the ruby platform means anything but windows.

But then I had a moment of inspiration: why not install the gem from github, in the correct version?

So in my Gemfile I wrote

gem 'eventmachine', '1.2.7', git: 'git@github.com:eventmachine/eventmachine', tag: 'v1.2.7'

This installs the required version directly from git, which does work and does not break my deployment script/routine.

News
[wice-grid] solving error with losing filters upon paginating

We encountered this strange error using WiceGrid: on some occasions when paginating to the second page, we actually lost the filtering, but not for all columns.

WiceGrid offers to define columns which are only rendered when creating html or exporting to csv. For us specifically, in some cases we want to show some pretty html when rendering html but just show the text when rendering/exporting to csv. For instance:

g.column name: 'Status', attribute: 'status', in_csv: false do |plan|
  render 'grid_status_label', plan_request: plan, history: true
end
g.column name: 'Status', attribute: 'status', in_html: false

When rendering html, it will render a partial called grid_status_label, when rendering csv it will just show the status-text.

However, defining the same column twice also has an effect on the filter. Whether that is because we "exclude" one of the column definitions or because the column is defined twice, I am not sure. The easy way out would be to know whether we are rendering csv before defining the column, so we never define it twice and do not confuse WiceGrid.

Luckily, we can ask the @grid if it is outputting csv. So if in your controller you write something like

@grid = initialize_grid(SomethingWithAStatus, ...)

in the view you can just ask @grid.output_csv? to know if we are currently exporting to csv instead of html.

So with that knowledge, in your view you can write

<%= grid(@grid) do |g|  
       [.. your other columns ..]            

       g.column name: 'Status', attribute: 'status', in_csv: false do |plan|
         render 'grid_status_label', plan_request: plan, history: true
       end
       if @grid.output_csv?
         g.column name: 'Status', attribute: 'status', in_html: false
       end
     end -%>

... and pagination while filtering on status will work!!

I really love(d) using WiceGrid, but unfortunately it is no longer actively maintained. There is a somewhat active branch, but it only works for rails 5 and I am not entirely sure what its status is. So this is at least a fix that lets us keep using WiceGrid in our current projects for now.

I am not quite sure how I would like to proceed with WiceGrid: the code-base is really large and there are some things I do not really like (e.g. having to use erb, the dsl is sometimes a bit heavy, there is no test-coverage, only a separate test-project but mmmm, and the layout is pretty much fixed). On the other hand it has proven extremely easy, robust and extensible (define your own column-filter and render types). I will probably try to fork it or restart with something similar.

More ...
News
[rails] styling on_the_spot with bootstrap v3/v4

The on_the_spot gem allows inline editing of data. In general this is something I prefer over forms: I do not want to switch to a new page to edit something, I want to edit it where I see it (I understand there are some very good cases for the standard show/edit pages).

So a very long while ago I created a small gem to edit data inline. It relies on the jEditable javascript, which is still working.

But how do you style the dynamically injected form?

In my projects, I use the translation files as follows, e.g. in on_the_spot.en.yml I write:

en:
  on_the_spot:
    ok: <button class="btn btn-primary btn-sm">Ok</button>
    cancel: <button class="btn btn-default btn-sm">Cancel</button>
    tooltip: Click to edit
    access_not_allowed: Access not allowed

This will make sure the buttons are styled correctly. But if you try this, the input is too narrow, and everything is just squished together.

So add this little sprinkle of css (written in SCSS syntax, hence the nesting and the // comment) to make everything look a little better:

.on_the_spot_editing {
  input, select {
    width: auto !important;
    height: 30px !important;

    margin-right: 5px !important;

    //display: block;

    padding: 6px 12px;
    font-size: 14px;
    line-height: 1.42857143;
    color: #555555;
    background-color: #fff;
    background-image: none;
    border: 1px solid #ccc;
    border-radius: 4px;

    -webkit-box-shadow: inset 0 1px 1px rgba(0, 0, 0, 0.075);
    box-shadow: inset 0 1px 1px rgba(0, 0, 0, 0.075);
    -webkit-transition: border-color ease-in-out 0.15s, box-shadow ease-in-out 0.15s;
    -o-transition: border-color ease-in-out 0.15s, box-shadow ease-in-out 0.15s;
    transition: border-color ease-in-out 0.15s, box-shadow ease-in-out 0.15s;
  }
  textarea {
    width: 80%;
  }

  .btn {
    margin: 1px !important;
  }
}

What inline editing solution are you using with rails?

I am currently contemplating switching to vue.js for javascript sprinkles like this.
