Add Character Encoding to Heroku buildpack

Using the charlock_holmes library in your Heroku application

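In this tutorial, we’ll add support for the charlock_holmes gem to a Heroku buildpack. charlock_holmes detects and converts character encodings in Ruby. Its native extension is compiled against ICU4C, a native library that must be available both when the gem is built and when the app runs.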

Setup: See the [setting up a custom build pack](/how-do-i/ruby/setup-custom-ruby-buildpack.html) article first.

First, create an S3 bucket to store the binary library for use in your buildpack. Let’s call it my-heroku-binaries. In lib/language_pack/base.rb, add a constant for your binary bucket URL.

MY_BINARY_URL = 'https://s3.amazonaws.com/my-heroku-binaries'
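The later snippets also reference ICU4C_VERSION and ICU4C_VENDOR_PATH, which this article never defines. A minimal sketch, assuming they sit next to MY_BINARY_URL and that their values match the /app/vendor/icu4c-49.1.2 prefix used in the rest of the tutorial (both values are inferred, and ICU4C_TARBALL_URL is a hypothetical name added only for illustration):

```ruby
# Assumed values: inferred from the icu4c-49.1.2 prefix, not taken from the buildpack.
MY_BINARY_URL     = 'https://s3.amazonaws.com/my-heroku-binaries'
ICU4C_VERSION     = '49.1.2'
ICU4C_VENDOR_PATH = "icu4c-#{ICU4C_VERSION}"

# Hypothetical constant: the tarball location the install step would download.
ICU4C_TARBALL_URL = "#{MY_BINARY_URL}/icu4c-#{ICU4C_VERSION}.tar.gz"
```

Deriving the vendor path from the version keeps the two from drifting apart when you rebuild against a newer ICU4C release.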

Build the ICU4C binary using vulcan (see Ryan Daigle’s tutorial on building with vulcan). Download the resulting tarball and upload it to your bucket, named to match the URL that install_icu4c fetches: icu4c-#{ICU4C_VERSION}.tar.gz.

Building via vulcan:

vulcan build -v -s icu/source -p /app/vendor/icu4c-49.1.2 \
  -c "./configure --prefix=/app/vendor/icu4c-49.1.2 --disable-samples --disable-tests && make && make install"

In the compile method in lib/language_pack/ruby.rb, add a call to the install_icu4c method before the gems are installed. At run time the library will live in /app/vendor/icu4c-49.1.2, the same prefix used for the vulcan build.

  def compile
    Dir.chdir(build_path)
    remove_vendor_bundle
    install_ruby
    install_jvm
    setup_language_pack_environment
    setup_profiled
    install_icu4c
    allow_git do
      install_language_pack_gems
      build_bundler
      create_database_yml
      install_binaries
      run_assets_precompile_rake_task
    end
    super
  end

Also update the setup_language_pack_environment and setup_profiled methods in lib/language_pack/ruby.rb so that LD_LIBRARY_PATH points at the ICU4C libraries, both during the build and when the app runs.

  # sets up the environment variables for the build process
  def setup_language_pack_environment
    setup_ruby_install_env

    config_vars = default_config_vars.each do |key, value|
      ENV[key] ||= value
    end
    ENV["GEM_HOME"] = slug_vendor_base
    ENV["PATH"]     = "#{ruby_install_binstub_path}:#{slug_vendor_base}/bin:#{config_vars["PATH"]}"
    ENV["LD_LIBRARY_PATH"] = "vendor/#{ICU4C_VENDOR_PATH}/lib"
  end

  # sets up the profile.d script for this buildpack
  def setup_profiled
    set_env_override "GEM_PATH", "$HOME/#{slug_vendor_base}:$GEM_PATH"
    set_env_default  "LANG",     "en_US.UTF-8"
    set_env_override "PATH",     "$HOME/bin:$HOME/#{slug_vendor_base}/bin:$PATH"
    set_env_default  "LD_LIBRARY_PATH", "/app/vendor/#{ICU4C_VENDOR_PATH}/lib"

    if ruby_version_jruby?
      set_env_default "JAVA_OPTS", default_java_opts
      set_env_default "JRUBY_OPTS", default_jruby_opts
      set_env_default "JAVA_TOOL_OPTIONS", default_java_tool_options
    end
  end
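If other libraries already rely on LD_LIBRARY_PATH, assigning it outright, as the build-time line above does, would drop their entries. A minimal sketch of a helper that prepends instead (prepend_to_path is a hypothetical name, not part of the buildpack, and the directory is only an example):

```ruby
# Hypothetical helper: prepend +dir+ to a colon-separated search path,
# keeping the existing entries and avoiding duplicates.
def prepend_to_path(existing, dir)
  entries = existing.to_s.split(':').reject { |e| e.empty? || e == dir }
  ([dir] + entries).join(':')
end

icu_lib = 'vendor/icu4c-49.1.2/lib'
ENV['LD_LIBRARY_PATH'] = prepend_to_path(ENV['LD_LIBRARY_PATH'], icu_lib)
```

Whether this matters depends on what else your buildpack vendors; with ICU4C as the only native dependency, the plain assignment behaves the same.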

Update build_bundler and change the env_vars assignment inside the Dir.mktmpdir("libyaml-") do |tmpdir| block to include BUNDLE_BUILD__CHARLOCK_HOLMES, which tells Bundler to pass the ICU directory to the gem's native extension build:

env_vars       = "env BUNDLE_GEMFILE=#{pwd}/Gemfile BUNDLE_CONFIG=#{pwd}/.bundle/config BUNDLE_BUILD__CHARLOCK_HOLMES=\"--with-icu-dir=#{pwd}/vendor/#{ICU4C_VENDOR_PATH}\" CPATH=#{yaml_include}:$CPATH CPPATH=#{yaml_include}:$CPPATH LIBRARY_PATH=#{yaml_lib}:$LIBRARY_PATH RUBYOPT=\"#{syck_hack}\""
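Bundler reads per-gem build flags from environment variables named BUNDLE_BUILD__ followed by the upper-cased gem name, which is the environment form of the build.charlock_holmes config key. A small sketch of how that assignment could be assembled (build_flag_env is a hypothetical helper, and the directory is only an example):

```ruby
# Hypothetical helper: format the environment assignment Bundler uses for a
# gem's build flags. Handles simple gem names only (no dots or dashes).
def build_flag_env(gem_name, flags)
  key = "BUNDLE_BUILD__#{gem_name.upcase}"
  %(#{key}="#{flags}")
end

icu_dir = File.join(Dir.pwd, 'vendor', 'icu4c-49.1.2')
puts build_flag_env('charlock_holmes', "--with-icu-dir=#{icu_dir}")
```

Pulling the flag into its own string can make the long env_vars line easier to read and to extend if another gem needs build options later.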

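Finally, create the install_icu4c method below. It uses the ICU4C_VERSION and ICU4C_VENDOR_PATH constants, which need to be defined alongside MY_BINARY_URL: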

def install_icu4c
  dir = File.join('vendor', ICU4C_VENDOR_PATH)
  FileUtils.mkdir_p dir
  Dir.chdir(dir) do
    run("curl #{MY_BINARY_URL}/icu4c-#{ICU4C_VERSION}.tar.gz -s -o - | tar xzf -")
  end
end
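Because curl runs with -s, a missing or misnamed tarball may produce little output, and the problem might only surface later when the gem fails to compile. A minimal sketch of a check that could run after extraction, assuming the tarball unpacks a lib directory containing the libicu* shared libraries (icu4c_installed? is a hypothetical helper, not part of the buildpack):

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical check: true when the extracted ICU4C tree contains library files.
def icu4c_installed?(dir)
  !Dir.glob(File.join(dir, 'lib', 'libicu*')).empty?
end

# Example with a throwaway directory standing in for vendor/icu4c-49.1.2.
Dir.mktmpdir do |tmp|
  puts icu4c_installed?(tmp)   # nothing has been extracted yet
  FileUtils.mkdir_p(File.join(tmp, 'lib'))
  FileUtils.touch(File.join(tmp, 'lib', 'libicuuc.so'))
  puts icu4c_installed?(tmp)   # a library file is now present
end
```

In the buildpack, the same check could be called at the end of install_icu4c with the vendor directory, aborting the build with a clear message when it returns false.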

Add charlock_holmes to your Gemfile and you are all good to go.
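To confirm the gem loads against the vendored ICU4C after a deploy, a quick smoke test can help. The sketch below uses CharlockHolmes::EncodingDetector.detect from the gem and falls back gracefully when the gem or its libraries cannot be loaded (detect_encoding is a hypothetical wrapper name):

```ruby
# Returns the detection hash from charlock_holmes, or nil when the gem
# (or the ICU libraries it links against) cannot be loaded.
def detect_encoding(content)
  require 'charlock_holmes'
  CharlockHolmes::EncodingDetector.detect(content)
rescue LoadError
  nil
end

result = detect_encoding("caf\u00e9 au lait")
puts(result ? "detected #{result[:encoding]}" : 'charlock_holmes is not available')
```

Running this in a one-off dyno (for example through a console session) exercises the run-time LD_LIBRARY_PATH setting as well as the build.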
