Tunings and tweaks

This page tracks changes to the underlying systems and default configurations that are needed to enable greater scalability for OpenStack. The goal is to share this knowledge until the out-of-the-box defaults can be changed. If that does not happen, many of the items here should probably move into the end-user documentation.

General

You will hit operating system limits as you scale. The settings below are a big hammer and will need some tightening for security.

Increase the number of open files for the messaging and database services. On EL6, edit /etc/security/limits.conf:

     mysql      soft    nofile    16384
     mysql      hard    nofile    16384
     qpidd      soft    nofile    16384
     qpidd      hard    nofile    16384
     rabbitmq   soft    nofile    16384
     rabbitmq   hard    nofile    16384
     postgres   soft    nofile    16384
     postgres   hard    nofile    16384

Increase the number of processes in /etc/security/limits.d/90-nproc.conf:

     *          soft    nproc     10240
     root       soft    nproc     unlimited
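
To confirm that a raised limit has reached a running service, the kernel's view can be read from /proc. The following is a minimal sketch that assumes a Linux /proc filesystem; soft_nofile is a helper name made up for illustration.

```shell
# Print the soft "Max open files" limit recorded in a /proc/<pid>/limits file.
# soft_nofile is a hypothetical helper name, not part of any package.
soft_nofile() {
    # Line format: "Max open files  <soft>  <hard>  files"
    awk '/^Max open files/ { print $4 }' "$1"
}

# Example: inspect the current shell, if /proc is available.
if [ -r "/proc/$$/limits" ]; then
    soft_nofile "/proc/$$/limits"
fi
```

For a service, point the helper at the daemon's own file, for example soft_nofile "/proc/$(pidof -s mysqld)/limits", assuming the daemon process is named mysqld.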

Systemd (EL7) uses per-service limits and does not use /etc/security/limits.conf. One way to set a per-service limit is to create a new unit file in /etc/systemd/system/ that includes the packaged unit file and overrides parts of it.

/etc/systemd/system/rabbitmq-server.service:

   .include /lib/systemd/system/rabbitmq-server.service
   [Service]
   LimitNOFILE=16384

Instead of creating a unit file that shadows the packaged one, you can extend the packaged unit with a drop-in file:

/etc/systemd/system/mariadb.service.d/limits.conf:

   [Service]
   LimitNOFILE=16384

If the service is already enabled and its service link points to the packaged unit file, you need to disable and re-enable the service:

   systemctl stop rabbitmq-server
   systemctl disable rabbitmq-server
   systemctl enable rabbitmq-server
   systemctl start rabbitmq-server

The systemd configuration files need to be reloaded after configuration changes:

   systemctl daemon-reload
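
The drop-in approach described above can be scripted. This is only a sketch: write_nofile_dropin is a hypothetical helper, and its optional third argument (the unit directory root) exists only so the function can be tried against a scratch directory before touching /etc/systemd/system.

```shell
# Write a systemd drop-in that sets LimitNOFILE for one unit.
# Usage: write_nofile_dropin <unit> <limit> [unit-dir-root]
# write_nofile_dropin is a hypothetical helper name.
write_nofile_dropin() {
    unit="$1"
    limit="$2"
    root="${3:-/etc/systemd/system}"
    mkdir -p "$root/$unit.d"
    printf '[Service]\nLimitNOFILE=%s\n' "$limit" > "$root/$unit.d/limits.conf"
}

# Dry run against a scratch directory instead of /etc/systemd/system.
scratch="$(mktemp -d)"
write_nofile_dropin mariadb.service 16384 "$scratch"
cat "$scratch/mariadb.service.d/limits.conf"
```

On a real host, omit the third argument, then run systemctl daemon-reload and restart the service. Running systemctl show mariadb.service -p LimitNOFILE afterwards should report the new value.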

HAProxy requires the file descriptor limits to be configured inside its own config file. See http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#3.2-maxconn

By default a lot of data goes to /var. If possible, put /var on a bigger, faster drive, and use an SSD if one is available.

Neutron

To increase the allowed number of networks and similar components, run the following from the command line, where the --tenant-id value ($admin here) is the UUID of the OpenStack tenant running the test. A value of -1 means unlimited.

     neutron quota-update --tenant-id $admin --network -1 --subnet -1 --port -1 --router -1 --floatingip -1 --security_group -1 --security_group_rule -1
     # add --vip -1 --pool -1 if needed

Use jumbo frames for interfaces carrying GRE/VXLAN traffic.

Compute node(s):

     echo MTU=<MTU> >> /etc/sysconfig/network-scripts/ifcfg-<interface>

Also set network_device_mtu=<Guest MTU> in the nova.conf file. The guest MTU is 50 bytes less than the tunnel interface MTU for VXLAN, and 28 bytes less for GRE.

Network node(s):

     echo MTU=<MTU> >> /etc/sysconfig/network-scripts/ifcfg-<interface>
     echo dnsmasq_config_file=/etc/neutron/dnsmasq-neutron.conf >> /etc/neutron/dhcp_agent.ini
     echo dhcp-option-force=26,<MTU> >> /etc/neutron/dnsmasq-neutron.conf
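
The guest MTU arithmetic can be sketched as a small helper. The overheads are the figures quoted above (50 bytes for VXLAN, 28 for GRE); guest_mtu is a hypothetical function name, and the tunnel MTU of 9000 is only an assumed jumbo-frame value for illustration.

```shell
# Compute the guest MTU from the tunnel interface MTU.
# Overheads are the figures quoted in the text: 50 bytes for VXLAN, 28 for GRE.
# guest_mtu is a hypothetical helper name.
guest_mtu() {
    tunnel_mtu="$1"
    case "$2" in
        vxlan) overhead=50 ;;
        gre)   overhead=28 ;;
        *)     echo "unknown tunnel type: $2" >&2; return 1 ;;
    esac
    echo $((tunnel_mtu - overhead))
}

guest_mtu 9000 vxlan   # prints 8950
guest_mtu 9000 gre     # prints 8972
```

The printed value is what would go into network_device_mtu for the chosen tunnel type.
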

Disable secure rootwrap:

Change root_helper in /etc/neutron/neutron.conf and in every other Neutron ini file in use:

     [agent]
     root_helper = sudo

Edit the sudoers file to allow neutron to use sudo without a password for the commands Neutron requires. This makes command execution faster, but without rootwrap filtering it is less secure. The sudoers file alone cannot properly filter malicious command patterns such as: ip netns exec net-ns evil_command.

Compute node

Ensure tuned is in place and check which profile is active:

     tuned-adm list

There is a limit on how many rows are returned from queries. Raise it in /etc/nova/nova.conf:

     osapi_max_limit=10000

Increase the defaults for MySQL, including the number of connections, in the [mysqld] section of /etc/my.cnf:

     max_connections = 15360
     innodb_buffer_pool_size = 10G
     innodb_flush_method = O_DIRECT
     innodb_file_per_table
     innodb_flush_log_at_trx_commit = 0
     innodb_log_file_size=1500M
     innodb_log_files_in_group=2

Consider using the thread_handling=pool-of-threads option when MariaDB needs to handle a large number of mostly idle connections. By default MariaDB creates a thread for every connection, which consumes a significant amount of memory when it has to handle thousands of connections.

Avoid setting charset=utf8 without use_unicode=0 in the MySQL connection strings. See http://docs.sqlalchemy.org/en/rel_0_9/dialects/mysql.html#unicode
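
One way to audit existing configuration files for this is a simple grep. This is a sketch under assumptions: find_bad_charset is a hypothetical helper name, and the connection string in the example (user, password, host, database) is invented for illustration.

```shell
# List connection strings that set charset=utf8 without use_unicode=0.
# find_bad_charset is a hypothetical helper name.
find_bad_charset() {
    grep -H 'charset=utf8' "$@" | grep -v 'use_unicode=0'
}

# Example against a scratch file with one offending and one acceptable line.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
connection = mysql://nova:secret@db.example.com/nova?charset=utf8
connection = mysql://glance:secret@db.example.com/glance?charset=utf8&use_unicode=0
EOF
find_bad_charset "$conf"
```

Only the first line of the scratch file is reported. On a real host the helper would be pointed at the service configuration files, for example find_bad_charset /etc/nova/nova.conf.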

RabbitMQ: if your Erlang version supports HiPE compilation, you can enable it in /etc/rabbitmq/rabbitmq.config. See http://www.fpaste.org/125147/40791409

Keystone

Reduce the default token duration on Keystone from 1 day (86400 seconds) to 1 hour (3600 seconds) by setting expiration to 3600 in the [token] section of the keystone.conf file on the controller.

Update revocation_cache_time from 1 to 300 for the auth_token middleware. Until this is changed in the code, you need to update each service-specific file, such as glance-api.conf, and add revocation_cache_time=300 to its [keystone_authtoken] section.

Make sure you have a crontab entry for '/usr/bin/keystone-manage token_flush' and that it runs on at least one server at least once every hour.
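
An hourly entry might look like the output of the sketch below. It assumes the /etc/cron.d file format (which includes a user field) and a system user named keystone; token_flush_cron_line is a hypothetical helper name, and the minute argument is there so different servers can be staggered.

```shell
# Print an /etc/cron.d style entry that flushes expired tokens once an hour.
# Fields: minute hour day-of-month month day-of-week user command.
# token_flush_cron_line is a hypothetical helper; the keystone user is assumed.
token_flush_cron_line() {
    minute="${1:-0}"
    printf '%s * * * * keystone /usr/bin/keystone-manage token_flush >/dev/null 2>&1\n' "$minute"
}

# Example: run at 15 minutes past every hour.
token_flush_cron_line 15
```

Redirecting the output into a file under /etc/cron.d/ would install the entry. For a per-user crontab, drop the user field.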

Swift

The Swift parameters are missing from the quickstack params.pp. As a result, the initial Puppet runs of a Foreman-based packstack + quickstack deployment fail to set up a cluster properly unless the following are added to /usr/share/openstack-foreman-installer/puppet/modules/quickstack/manifests/params.pp:

     $swift_admin_password

     $swift_shared_secret