cookbook 'dse', '= 3.0.15', :supermarket
Installs/Configures Datastax Enterprise.
cookbook 'dse', '= 3.0.15'
knife cookbook site install dse
knife cookbook site download dse
Datastax Enterprise Chef Cookbook (Apache Cassandra)
This cookbook installs and configures Datastax Enterprise. More information is available on the DataStax Enterprise site.
It uses officially released Datastax packages. It can tweak the Cassandra config files, but has no way of adding data or creating keyspaces in Cassandra (yet).
Usage
This cookbook is designed to be used in conjunction with a wrapper cookbook. Used alone, it can create a single-node cluster; to create a multi-node cluster, a wrapper is recommended.
Example in a wrapper:
node.default['java']['jdk_version'] = "7"
node.default['cassandra']['seeds'] = "192.168.1.1, 192.168.1.2"
node.default['cassandra']['dse_version'] = "4.0.3-1"
node.default['cassandra']['max_heap_size'] = "12G"
node.default['cassandra']['heap_newsize'] = "1200M"
include_recipe "dse::cassandra"
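For a multi-node cluster, the seeds attribute in the wrapper above is a single comma-separated string. The following is a minimal sketch of building that string from a list of addresses, written as plain Ruby so it can be tried outside a Chef run. The helper name `seed_string` and the sample addresses are illustrative and are not part of the cookbook.

```ruby
require 'ipaddr'

# Hypothetical helper: join seed IPs into the comma-separated string
# format used for node['cassandra']['seeds'] in the wrapper example.
def seed_string(ips)
  cleaned = ips.map(&:strip).uniq
  # IPAddr.new raises IPAddr::InvalidAddressError on a malformed address.
  cleaned.each { |ip| IPAddr.new(ip) }
  cleaned.join(', ')
end

puts seed_string(['192.168.1.1', ' 192.168.1.2'])
```

In a wrapper recipe, the result would be assigned to node.default['cassandra']['seeds'] before include_recipe is called.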
Scope
This cookbook attempts to manage almost all Apache Cassandra configuration settings. It can also create Hadoop and Solr nodes, though fewer attributes are available to manage their configuration.
Apache Cassandra
This cookbook currently provides
- Datastax 4.x.x (Datastax Enterprise Edition) via packages.
Requirements
- Chef 11 or higher
Supported OS Distributions
Tested on:
- RHEL 6.3, 6.4
- Ubuntu 14.04.1 LTS
- Slight testing done on Ubuntu 12.04 (will require some edits)
Recipes
The provided recipes are dse::cassandra, dse::solr, and dse::hadoop.
* dse::cassandra will provision DSE as a Cassandra node.
* dse::solr will provision DSE with Solr enabled.
* dse::hadoop will provision DSE with Hadoop enabled.
There are also recipes used for configuration that should not be called directly:
* dse::default sets up the templates
* dse::datastax sets up the datastax repos
* dse::datastax-agent configures the datastax-agent if needed
* dse::ssl (work in progress) sets up SSL keys on all nodes
Attributes
This cookbook will install DSE Cassandra by default. Other attributes you can set are:
default.rb
overall settings
* node["cassandra"]["cluster_name"] (default: Test Cluster): the name of the cluster to provision
* node["cassandra"]["vnodes"] (default: true): enable or disable vnodes
* node["cassandra"]["intial_token"] (default: nil): the initial token to use; leave blank for vnodes
* node["cassandra"]["num_tokens"] (default: 256): the number of tokens to use
* node["cassandra"]["solr"] (default: false): enable Solr or not
* node["cassandra"]["hadoop"] (default: false): enable Hadoop or not
* node["cassandra"]["dse_version"] (default: 4.0.3-1): the DSE version to install
* node["cassandra"]["user"] (default: cassandra): the cassandra user
* node["cassandra"]["group"] (default: cassandra): the cassandra group
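When vnodes are disabled, each node needs its own initial token. The following is a minimal sketch of computing evenly spaced tokens. It assumes the Murmur3Partitioner (token range -2**63 to 2**63 - 1); this document does not state which partitioner the cookbook configures, so check your cassandra.yaml before using these values. The helper name `initial_tokens` is illustrative.

```ruby
# Evenly spaced initial tokens for a cluster that does not use vnodes.
# Assumption: Murmur3Partitioner, whose tokens span -2**63 .. 2**63 - 1.
def initial_tokens(node_count)
  raise ArgumentError, 'node_count must be positive' unless node_count > 0

  step = 2**64 / node_count
  (0...node_count).map { |i| step * i - 2**63 }
end

initial_tokens(4).each_with_index do |token, i|
  puts "node #{i}: #{token}"
end
```

Each node would receive one value from the list through its initial token attribute, with vnodes set to false.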
cassandra.yaml settings
* node["cassandra"]["listen_address"] (default: node['ipaddress']): the IP address to use for the listen address
* node["cassandra"]["rpc_address"] (default: node['ipaddress']): the IP address to use for the rpc address
* node["cassandra"]["broadcast_address"] (default: nil): the IP address to use for the broadcast address
* node["cassandra"]["seeds"] (default: node['ipaddress']): the IP address to use for the seed list
* node["cassandra"]["concurrent_reads"] (default: 32): concurrent reads setting
* node["cassandra"]["concurrent_writes"] (default: 32): concurrent writes setting
* node["cassandra"]["compaction_thruput"] (default: 16): limit the throughput of compactions
* node["cassandra"]["multithreaded_compaction"] (default: false): enable or disable multithreaded compaction
* node["cassandra"]["in_memory_compaction_limit"] (default: 64): size limit for in-memory compactions
* node["cassandra"]["trickle_fsync"] (default: false): enable trickle fsync, usually for SSDs
* node["cassandra"]["range_request_timeout_in_ms"] (default: 10000): default timeout on range requests
* node["cassandra"]["thrift_framed_transport_size_in_mb"] (default: 15): the max size of a thrift frame
* node["cassandra"]["thrift_max_message_length_in_mb"] (default: nil): the max message length of a thrift call
* node["cassandra"]["concurrent_compactors"] (default: nil): the number of concurrent compactors to allow
Role based seed selection
* node["cassandra"]["role_based_seeds"] (default: false): set to true to assign seeds based on members of the dse-seed role
* node['cassandra']['seed_role'] (default: role:dse-seed): set to a different role to select seeds
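Role-based seed selection amounts to searching for nodes that carry the seed role and joining their addresses. The sketch below illustrates that idea in plain Ruby. It is not the cookbook's actual implementation, which is not shown here: the `found_nodes` array stands in for the result of a Chef search, and the helper name `seeds_from` and the fallback to the node's own address are assumptions.

```ruby
# Hypothetical stand-in for the result of a Chef search such as
#   search(:node, 'role:dse-seed')
# Each entry mimics only the attribute this sketch reads.
found_nodes = [
  { 'ipaddress' => '192.168.1.2' },
  { 'ipaddress' => '192.168.1.1' }
]

# Build the seed list from the search result. Falls back to this node's
# own address when no node carries the seed role yet (an assumption).
def seeds_from(nodes, own_ip)
  ips = nodes.map { |n| n['ipaddress'] }.compact.sort
  ips.empty? ? own_ip : ips.join(', ')
end

puts seeds_from(found_nodes, '192.168.1.3')
```

Sorting the addresses keeps the seed list stable between Chef runs, which avoids rewriting the config file when the search returns nodes in a different order.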
gc settings
* node["cassandra"]["CMSInitiatingOccupancyFraction"] (default: 65): CMS occupancy fraction to use for GC
* node["cassandra"]["max_heap_size"] (default: 8192M): default max heap size for Cassandra
* node["cassandra"]["heap_newsize"] (default: 800M): default new gen size for the heap
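The heap defaults above are fixed values, so smaller or larger machines usually need overrides. The following is a minimal sketch of deriving both settings from the machine's memory and core count. It models the heuristic in the stock cassandra-env.sh; that choice is an assumption, since the cookbook's own template may size the heap differently. The helper name `heap_sizes` is illustrative.

```ruby
# Heap sizing heuristic modelled on the stock cassandra-env.sh
# (an assumption; the cookbook's own template may differ):
#   max heap = max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB))
#   new gen  = min(100 MB * CPU cores, 1/4 of max heap)
def heap_sizes(system_memory_mb, cpu_cores)
  half     = [system_memory_mb / 2, 1024].min
  quarter  = [system_memory_mb / 4, 8192].min
  max_heap = [half, quarter].max
  new_size = [100 * cpu_cores, max_heap / 4].min
  { 'max_heap_size' => "#{max_heap}M", 'heap_newsize' => "#{new_size}M" }
end

sizes = heap_sizes(32_768, 8)
puts "max_heap_size=#{sizes['max_heap_size']} heap_newsize=#{sizes['heap_newsize']}"
```

For a machine with 32 GB of memory and 8 cores, this yields 8192M and 800M, the same values as the defaults listed above.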
authentication settings
* node["cassandra"]["authentication"] (default: false): enable or disable authentication
* node["cassandra"]["authorization"] (default: false): enable or disable authorization
* node["cassandra"]["authenticator"] (default: empty): the authenticator to use (e.g. org.apache.cassandra.auth.AllowAllAuthenticator)
* node["cassandra"]["authorizor"] (default: empty): the authorizer to use (e.g. org.apache.cassandra.auth.AllowAllAuthorizer)
audit logs
* node["cassandra"]["log_level"] (default: INFO): the log level for Cassandra (or Solr/Hadoop)
* node["cassandra"]["audit_logging"] (default: false): turn on audit logging
* node["cassandra"]["audit_dir"] (default: /var/log/cassandra): the directory to put audit logs in
* node["cassandra"]["active_categories"] (default: ADMIN,AUTH,DDL,DCL): the categories to audit
dse.rb
* node["cassandra"]["dse"]["delegated_snitch"] (default: org.apache.cassandra.locator.SimpleSnitch): the snitch to use for DSE
* node["cassandra"]["dse"]["snitch"] (default: com.datastax.bdp.snitch.DseDelegateSnitch): the snitch to use in dse.yaml
* node["cassandra"]["dse"]["service_name"] (default: dse): the name of the service
* node["cassandra"]["dse"]["conf_dir"] (default: /etc/dse): the directory of DSE config files
* node["cassandra"]["dse"]["repo_user"] (default: empty): the datastax username for the repo
* node["cassandra"]["dse"]["repo_pass"] (default: empty): the datastax password for the repo
* node["cassandra"]["dse"]["rhel_repo_url"] (default: http://#{node['cassandra']['dse']['repo_user']}:#{node['cassandra']['dse']['repo_pass']}@rpm.datastax.com/enterprise): the RHEL repo
* node["cassandra"]["dse"]["debian_repo_url"] (default: http://#{node['cassandra']['dse']['repo_user']}:#{node['cassandra']['dse']['repo_pass']}@debian.datastax.com/enterprise): the Debian repo
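The default repo URLs interpolate the username and password directly, so a credential containing a reserved character such as "@" or ":" would produce a malformed URL. This document does not say whether the cookbook encodes them, so the sketch below shows one way to build an override value with percent-encoded credentials. The helper name `repo_url` and the sample credentials are illustrative.

```ruby
require 'erb'

# Build a repository URL with embedded credentials, percent-encoding the
# user name and password so characters such as "@" or ":" cannot break it.
def repo_url(user, pass, host)
  encoded_user = ERB::Util.url_encode(user)
  encoded_pass = ERB::Util.url_encode(pass)
  "http://#{encoded_user}:#{encoded_pass}@#{host}/enterprise"
end

puts repo_url('user@example.com', 'p@ss:word', 'rpm.datastax.com')
```

The result could be assigned to the rhel_repo_url or debian_repo_url attribute in a wrapper cookbook, keeping the credentials themselves in an encrypted data bag or similar store rather than in the attributes file.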
hadoop.rb
* node["hadoop"]["max_heap_size"] (default: 10G): the heap size for Hadoop
* node["hadoop"]["heap_newsize"] (default: 800M): the heap new gen size for Hadoop
* node["hadoop"]["map_child_java_opts"] (default: 4G): the size of the map child Java heap
* node["hadoop"]["reduce_child_java_opts"] (default: 4G): the size of the reduce child Java heap
* node["hadoop"]["map_red_localdir"] (default: /data/mapredlocal): the directory to use for map/reduce
* node["hive"]["scratch_dir"] (default: /data/hive): the directory to use for Hive
* node["hadoop"]["map_reduce_parallel_copies"] (default: 20): the number of map reduce copies
* node["hadoop"]["mapred_tasktracker_map_tasks_max"] (default: 23): the max number of map tasks
* node["hadoop"]["mapred_tasktracker_reduce_tasks_max"] (default: 12): the max number of reduce tasks
* node["hadoop"]["io_sort_mb"] (default: 512M): the size of iosort
* node["hadoop"]["io_sort_factor"] (default: 64): the iosort factor
solr.rb
* node["solr"]["max_heap_size"] (default: 14G): the heap size for Solr
* node["solr"]["heap_newsize"] (default: 2400M): the new gen heap size
java.rb
These are generic Java settings. Datastax recommends Oracle Java, so these attributes override the OpenJDK default and download the JDK from a specific location.
* node["dse"]["manage_java"] (default: true): whether or not to use the java recipe to manage the Java install
* node["java"]["install_flavor"] (default: oracle): the flavor of Java to install
* node["java"]["jdk_version"] (default: 7): the version of Java to use
* node['java']['jdk']['7']['x86_64']['url'] (default: empty): the URL to get the Java 7 file from
ssl.rb
This portion is under construction. SSL support does not fully work yet.
* node["cassandra"]["dse"]["cassandra_ssl_dir"] (default: /etc/cassandra): the directory to use for pem files
* node["cassandra"]["dse"]["password_file"] (default: cassandra_pass.txt): the file to store the keystore password in
* node["cassandra"]["dse"]["internode_encyption"] (default: none): the encryption to use (all, dc, rack)
* node["cassandra"]["dse"]["keystore"] (default: #{node["cassandra"]["dse"]["cassandra_ssl_dir"]}/#{node["hostname"]}.keystore): keystore name
* node["cassandra"]["dse"]["truststore"] (default: #{node["cassandra"]["dse"]["cassandra_ssl_dir"]}/#{node["hostname"]}.truststore): truststore name
datastax-agent.rb
These attributes are used to configure the datastax-agent, which is used with Datastax Opscenter.
* node["datastax-agent"]["enabled"] (default: false): whether to install and configure the datastax agent
* node["datastax-agent"]["version"] (default: 4.1.1-1): the version of the datastax agent to install
* node["datastax-agent"]["conf_dir"] (default: /var/lib/datastax-agent/conf): where the datastax-agent conf file is
* node["datastax-agent"]["opscenter_ip"] (default: 192.168.32.3): the Opscenter IP to connect to
Dependencies
- java
- yum
- apt
Datastax recommends the Oracle JDK. You can select it by setting an attribute in your environment or run list.
Oracle currently prevents automated downloads of the package from its website. As a workaround, host the package yourself, for example in Artifactory, and override the Java URL with the node['java']['jdk']['7']['x86_64']['url'] attribute described under java.rb.
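A minimal sketch of the Java URL override follows. So that it runs as plain Ruby, it uses an auto-vivifying hash named `defaults` as a stand-in for Chef's node.default; in a wrapper cookbook you would write node.default[...] instead. The Artifactory host and file name are hypothetical placeholders.

```ruby
# Stand-in for Chef's node.default so this sketch runs as plain Ruby;
# in a wrapper cookbook, use node.default[...] directly instead.
defaults = Hash.new { |hash, key| hash[key] = Hash.new(&hash.default_proc) }

defaults['java']['install_flavor'] = 'oracle'
defaults['java']['jdk_version'] = '7'
# Hypothetical internal mirror; replace with wherever the JDK archive is hosted.
defaults['java']['jdk']['7']['x86_64']['url'] =
  'https://artifactory.example.com/java/jdk-7-linux-x64.tar.gz'

puts defaults['java']['jdk']['7']['x86_64']['url']
```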
Copyright & License
- Author: Daniel Parker (daniel.c.parker@target.com)
- Reviewer: Eric Helgeson (erichelgeson@gmail.com)
Released under the Apache 2.0 License.
Dependent cookbooks
- apt ~> 2.0
- yum-epel ~> 0.6
- yum ~> 3.5
- java ~> 1.14
Contingent cookbooks
There are no cookbooks that are contingent upon this one.
CHANGELOG for dse cookbook
This file is used to list changes made in each version of the dse cookbook.
3.0.14
- restart the datastax-agent on new version
3.0.13
- added support to set MaxTenuringThreshold in cassandra-env
3.0.12
- refactored some version checks
- added role based seed assignment
3.0.11
- add templates and upgrade to 4.5.2
3.0.10
- add templates and upgrade to 4.0.4
3.0.9
- remove the ssd tuning from this recipe
3.0.8
- minor updates
3.0.7
- adding ability to tune memtable thresholds
3.0.6
- removed specific tuning to move it to os-tuning cookbook
- adding ability to set concurrent compactors
3.0.5
- added thrift frame size settings for hadoop requests
3.0.4
- adding recommended datastax tuning settings
3.0.3
- adding cassandra specific gc settings
3.0.2
- more hadoop tuning
3.0.1
- adding hadoop tuning settings
3.0.0
- First pass of node-to-node ssl
- Adding another hadoop attribute
- fixing hadoop map reduce dir, as hadoop did not create it
- adding changes to ssl
- adding a version-specific dse script
- adding support to stop dse before an upgrade
- hive scratch directory support
2.3.5
- subscribed the dse service to java, so it will restart if java version changes
- added more chefspec
- moved the start of the dse service until after all the templates are set up
2.3.4
- allow support for gossipingPropertyFileSnitch
2.3.3
- Tell the OS that SSDs are present
2.3.2
- Allow Solr and Hadoop Heap to be set dynamically
2.3.1
- Allows this recipe to install the datastax-agent
2.3.0
- Allow DSE 4.0 to be installed
2.2.0
- Rename the cookbook to dse
2.1.3
- Added kitchen tests
- Added multiple data directory support
0.1.0
- Initial release of cassandra