Category: Random

Elasticsearch, updating your mapping

A very common issue while working with Elasticsearch and refining the data model of a project is running into errors because fields were mapped as generic text, so aggregations fail because fielddata is not enabled:

Fielddata is disabled on text fields by default. Set fielddata=true on [your_field_name] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory.

First of all, DO NOT enable fielddata. Think about it twice and say “NO”. There is almost always a better way (a keyword sub-field, for example), and fielddata costs a lot of heap memory. Read this for details: https://www.elastic.co/guide/en/elasticsearch/reference/current/fielddata.html

Retrieve the current mapping.

This is the first step: you have to understand what is wrong and how you intend to fix it. Use

curl -XGET "http://host:9200/indexname/_mapping"

to retrieve the current mapping of your index.
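Once you have the mapping, it helps to list the fields that are likely to trigger the fielddata error. The sketch below is only an illustration: it assumes the typeless response shape returned by Elasticsearch 7 and later (older versions add a mapping-type level), and the sample response and field names are made up.

```python
def text_fields_without_keyword(properties, prefix=""):
    """Return dotted names of `text` fields that have no `keyword` sub-field."""
    found = []
    for name, spec in properties.items():
        path = prefix + name
        if spec.get("type") == "text":
            subfields = spec.get("fields", {})
            if not any(sub.get("type") == "keyword" for sub in subfields.values()):
                found.append(path)
        # Object and nested fields carry their children under "properties".
        if "properties" in spec:
            found.extend(text_fields_without_keyword(spec["properties"], path + "."))
    return found


# Hypothetical _mapping response for an index called "indexname".
response = {
    "indexname": {
        "mappings": {
            "properties": {
                "client_ip": {"type": "text"},
                "message": {"type": "text", "fields": {"keyword": {"type": "keyword"}}},
                "geo": {"properties": {"city": {"type": "text"}}},
            }
        }
    }
}

suspects = text_fields_without_keyword(response["indexname"]["mappings"]["properties"])
print(suspects)
```

Fields reported here cannot be aggregated or sorted on as they are, so they are the first candidates for a different type in the new mapping.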

Create or update the mapping for your index.

First of all, have a look at this URL: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html

It will help to clarify which types you can use in your mapping. You will most likely need ip, date and long, especially if you’re going to deal with nested data.

Generally speaking, the template you’re going to apply will look like this:

{
  "order": 1,
  "settings": {
    ... this is where you specify any settings ...
  },
  "index_patterns": ["yourpattern"],
  "mappings": {
    ... this can be exactly what you get from the _mapping API of the index ...
  }
}

You may want to do that because some fields are mapped with the wrong type (e.g. numbers or IP addresses that are currently mapped as text).

Once you are done, invoke the _template API to create or update the template:

curl -XPUT -H 'Content-Type: application/json' http://localhost:9200/_template/yourpattern -d@/path/to/yourpattern.template.json

The template only applies to indices created from now on. So if you specify

"index_patterns": ["yourpattern-*"]

any new index matching that pattern will inherit the mapping (e.g. yourpattern-2019.01.15).
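If you prefer to generate the template file rather than edit it by hand, something like the following sketch could work. It assumes a typeless mapping (Elasticsearch 7 and later) and only overrides top-level fields; the helper name build_template and the sample field names are made up for illustration, and settings are left out.

```python
import copy
import json


def build_template(index_pattern, properties, type_overrides, order=1):
    """Build a legacy index template body from existing field mappings."""
    fixed = copy.deepcopy(properties)  # leave the caller's mapping untouched
    for field, new_type in type_overrides.items():
        fixed[field] = {"type": new_type}
    return {
        "order": order,
        "index_patterns": [index_pattern],
        "mappings": {"properties": fixed},
    }


# Hypothetical fields, as they might come back from the _mapping API.
current = {
    "client_ip": {"type": "text"},
    "bytes": {"type": "text"},
    "message": {"type": "text"},
}

template = build_template("yourpattern-*", current, {"client_ip": "ip", "bytes": "long"})
print(json.dumps(template, indent=2))
```

The printed JSON can be saved to a file and sent with the curl command above.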

Update an existing index

The above template won’t affect indices that already exist, so you might need to “migrate” the existing data so that it is mapped according to the new template. You can do that by telling Elasticsearch to copy the data into a new index with the _reindex API.

The example below assumes your existing data is stored in the index yourpattern-1-2019.01.15. Before proceeding, make sure nothing is writing to that index (it is a good idea to put a version number in the index name, so you can route traffic to a new index just by telling clients such as Logstash to use a different number). Then run the curl command below:

curl -XPOST "http://127.0.0.1:9200/_reindex?wait_for_completion=false" -H 'Content-Type: application/json' -d'
{
  "source": {
    "index": "yourpattern-1-2019.01.15"
  },
  "dest": {
    "index": "yourpattern-1.1-2019.01.15"
  }
}'

By specifying ?wait_for_completion=false, Elasticsearch returns a task ID right away instead of blocking. You can use that ID to monitor the process with the _tasks API:

curl -XGET "http://127.0.0.1:9200/_tasks/MM-HD_-cRXeOF6QESzidSQ:1155556"
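The _tasks response is JSON, so turning it into a progress figure is straightforward. This is a rough sketch: the sample document is hand-written and reduced to the fields used here, and the calculation ignores counters such as noops and version conflicts.

```python
def reindex_progress(task_response):
    """Return (completed, fraction_done) for a reindex task response."""
    status = task_response["task"]["status"]
    total = status.get("total", 0)
    done = status.get("created", 0) + status.get("updated", 0) + status.get("deleted", 0)
    fraction = done / total if total else 0.0
    return task_response.get("completed", False), fraction


# Hand-written sample, trimmed down to the fields read above.
sample = {
    "completed": False,
    "task": {
        "action": "indices:data/write/reindex",
        "status": {"total": 2000, "created": 500, "updated": 0, "deleted": 0},
    },
}

completed, fraction = reindex_progress(sample)
print(completed, f"{fraction:.0%}")
```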

The Reindex API is an incredibly powerful tool. I recommend reading this for more details: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html

Once the reindex has completed and you have checked the new index, you can safely remove the previous index with the wrongly mapped data.

CentOS 7: Disabling OOMKiller for a process

site unreliability

In the latest version of the proc filesystem the OOMKiller has had some adjustments.  The valid range is now -1000 to +1000; previously it was -16 to +15 with a special score of -17 to outright disable it.  It also now uses /proc/<pid>/oom_score_adj instead of /proc/<pid>/oom_adj.  You can read the finer details here.
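To get a feel for how the two scales relate, here is a small sketch of the conversion from a legacy oom_adj value to the new oom_score_adj range. It reflects my understanding of how the kernel translates writes to the old file (scale linearly, with the old maximum pinned to +1000); treat the exact rounding as an assumption rather than a reference.

```python
OOM_DISABLE = -17         # legacy value that disabled the OOM killer
OOM_ADJUST_MAX = 15       # legacy maximum
OOM_SCORE_ADJ_MAX = 1000  # new maximum


def legacy_to_score_adj(oom_adj):
    """Map a legacy oom_adj value (-17..15) onto the -1000..1000 scale."""
    if not OOM_DISABLE <= oom_adj <= OOM_ADJUST_MAX:
        raise ValueError("oom_adj must be between -17 and 15")
    if oom_adj == OOM_ADJUST_MAX:
        return OOM_SCORE_ADJ_MAX
    # int() truncates toward zero, like integer division in C.
    return int(oom_adj * OOM_SCORE_ADJ_MAX / -OOM_DISABLE)


for value in (-17, -16, 0, 15):
    print(value, "->", legacy_to_score_adj(value))
```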

Given that, systemd now includes OOMScoreAdjust specifically for altering this.  To fully disable OOMKiller on a service simply add OOMScoreAdjust=-1000 directly underneath a [Service] definition, as follows.

...
[Service]
OOMScoreAdjust=-1000
...

The score can be tuned if you want to ensure the parent PID lives while its child processes can still be safely reaped: set it to something like -999, and if “/bin/parent” has spawned “/bin/parent --memory-hungry-child”, the child will be killed first.

If you have a third-party daemon (like Datadog, used in this example below) which manages itself and uses a sysvinit script you can still calm…


How to fix a non-working targetcli/target on CentOS 7.1.1503

If you’re practicing on CentOS (as I do) for your RHCE/RHCSA exam, you are probably installing and working on a couple of VMs running CentOS 7.1 (even though 7.0 should be the reference).

You may have noticed that on that release the targetcli tool and its service (target.service) do not work because of a Python runtime error:

Installed:
 targetcli.noarch 0:2.1.fb41-3.el7

Dependency Installed:
 pyparsing.noarch 0:1.5.6-9.el7 python-configshell.noarch 1:1.1.fb18-1.el7 python-kmod.x86_64 0:0.9-4.el7
 python-rtslib.noarch 0:2.1.fb57-3.el7 python-urwid.x86_64 0:1.1.1-3.el7

Complete!
[root@server ~]# targetcli
Traceback (most recent call last):
  File "/bin/targetcli", line 24, in <module>
    from targetcli import UIRoot
  File "/usr/lib/python2.7/site-packages/targetcli/__init__.py", line 18, in <module>
    from .ui_root import UIRoot
  File "/usr/lib/python2.7/site-packages/targetcli/ui_root.py", line 27, in <module>
    from rtslib_fb import RTSRoot
  File "/usr/lib/python2.7/site-packages/rtslib_fb/__init__.py", line 24, in <module>
    from .root import RTSRoot
  File "/usr/lib/python2.7/site-packages/rtslib_fb/root.py", line 26, in <module>
    from .target import Target
  File "/usr/lib/python2.7/site-packages/rtslib_fb/target.py", line 24, in <module>
    from six.moves import range
ImportError: cannot import name range

You will also get an error on target.service:

[root@server ~]# systemctl start target.service
Job for target.service failed. See 'systemctl status target.service' and 'journalctl -xn' for details.
[root@server ~]# journalctl -xn
-- Logs begin at Sun 2016-08-28 22:35:23 CEST, end at Sun 2016-08-28 22:44:29 CEST. --
Aug 28 22:44:29 server.yari.local target[2208]: from six.moves import range
Aug 28 22:44:29 server.yari.local target[2208]: ImportError: cannot import name range
Aug 28 22:44:29 server.yari.local systemd[1]: target.service: main process exited, code=exited, status=1/FAILURE
Aug 28 22:44:29 server.yari.local systemd[1]: Failed to start Restore LIO kernel target configuration.
-- Subject: Unit target.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit target.service has failed.
--
-- The result is failed.
Aug 28 22:44:29 server.yari.local systemd[1]: Unit target.service entered failed state.

Following the Python traceback, it seems clear that the root cause is in the six.moves module, which is provided by the python-six rpm package: the installed version is too old to offer range. RHEL 7.1 ships a newer version (and of course both targetcli and target work properly on RHEL 7.1, as you can test on AWS if you can still use a free subscription).
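If you want to confirm the diagnosis before touching any package, a quick check like the one below can help. It is just a sketch: can_import is a helper made up for this post, and it simply reports whether a module exposes a given name.

```python
import importlib


def can_import(module_name, attribute):
    """Return True if `from module_name import attribute` would succeed."""
    try:
        module = importlib.import_module(module_name)
        getattr(module, attribute)
    except (ImportError, AttributeError):
        return False
    return True


# False here matches the ImportError in the traceback above
# (it is also False when six is not installed at all).
print(can_import("six.moves", "range"))
```

Run it with the same interpreter targetcli uses (Python 2.7 on this release), otherwise you may be checking a different copy of six.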

The best way to fix the issue is simply to upgrade that package to the newer version available from the repository:

yum update -y python-six

and you will see that target/targetcli start working just like on RHEL 7.1:

Dependencies Resolved

==========================================================================================================================================
 Package Arch Version Repository Size
==========================================================================================================================================
Updating:
 python-six noarch 1.9.0-2.el7 base 29 k

Transaction Summary
==========================================================================================================================================
Upgrade 1 Package

Total download size: 29 k
Downloading packages:
No Presto metadata available for base
python-six-1.9.0-2.el7.noarch.rpm | 29 kB 00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
 Updating : python-six-1.9.0-2.el7.noarch 1/2
 Cleanup : python-six-1.3.0-4.el7.noarch 2/2
 Verifying : python-six-1.9.0-2.el7.noarch 1/2
 Verifying : python-six-1.3.0-4.el7.noarch 2/2

Updated:
 python-six.noarch 0:1.9.0-2.el7

Complete!
[root@server ~]# targetcli
Warning: Could not load preferences file /root/.targetcli/prefs.bin.
targetcli shell version 2.1.fb41
Copyright 2011-2013 by Datera, Inc and others.
For help on commands, type 'help'.

/> ls
o- / ......................................................................................................................... [...]
 o- backstores .............................................................................................................. [...]
 | o- block .................................................................................................. [Storage Objects: 0]
 | o- fileio ................................................................................................. [Storage Objects: 0]
 | o- pscsi .................................................................................................. [Storage Objects: 0]
 | o- ramdisk ................................................................................................ [Storage Objects: 0]
 o- iscsi ............................................................................................................ [Targets: 0]
 o- loopback ......................................................................................................... [Targets: 0]
/> exit
Global pref auto_save_on_exit=true
Last 10 configs saved in /etc/target/backup.
Configuration saved to /etc/target/saveconfig.json

[root@server ~]# systemctl start target
[root@server ~]# systemctl status target
target.service - Restore LIO kernel target configuration
 Loaded: loaded (/usr/lib/systemd/system/target.service; disabled)
 Active: active (exited) since Sun 2016-08-28 22:50:38 CEST; 5s ago
 Process: 2367 ExecStart=/usr/bin/targetctl restore (code=exited, status=0/SUCCESS)
 Main PID: 2367 (code=exited, status=0/SUCCESS)
 CGroup: /system.slice/target.service

Aug 28 22:50:38 server.yari.local systemd[1]: Starting Restore LIO kernel target configuration...
Aug 28 22:50:38 server.yari.local systemd[1]: Started Restore LIO kernel target configuration.

Enjoy!

Migrate your self-managed wordpress blog to wordpress.com

 

I had been running this blog on a self-managed virtual host for a while (~2 years). Two weeks ago, to save some money and reduce the maintenance effort, I decided to migrate it to wordpress.com, where it is right now.

In theory WordPress provides various import / export tools (for many different sources), but I immediately ran into an issue with the way media resources are restored. Well, I found two issues…

  1. I can’t say whether it was caused by a wrong configuration in my previous blog setup or whether this is how it is supposed to work, but the images are linked with absolute URLs in the WordPress posts, and I found no way to make them dynamic.
  2. Export “all contents” doesn’t export “all contents”, despite what the WordPress import/export tool advertises.

For issue 1 I ran some tests to identify the right absolute path to use, then I applied the good old “search / replace” method to the exported XML file before uploading it to the new blog. That was enough.

For issue 2 I had to dig a bit more, and I found that you have to explicitly select media files for export in the WordPress tool. Once you have exported them and imported them into the new blog, you will see wordpress.com grabbing all your media attachments from the old blog.

It took a couple of hours to get to the bottom of both issues, plus many attempts, each followed by manual cleanup. Unfortunately there is no way to reset a wordpress.com blog, or at least I haven’t found a way to do it without cancelling the existing subscriptions on the blog itself (I had bought the domain mapping to yari.net…).

tl;dr

If the posts you imported from your old WordPress blog into the new one are missing their shiny images, consider proceeding as follows:

  1. Check the absolute path your old posts use to link and load images. Look for HTML tags like <img title="wp-1456610569051" class="alignnone size-full" alt="image" src="https://yaridotnet.files.wordpress.com/2016/02/wp-1456610569051.jpeg?w=768" originalw="768" scale="2">. The first part of the URL, "https://yaridotnet.files.wordpress.com/", is what you’re going to use to fix your export XML file.
  2. Export all your content from your old blog from the Tools / Export area of the WordPress administration dashboard.
  3. Do the same check as in point 1 on the posts of your old blog and verify which absolute URL your old blog uses.
  4. Edit the exported XML file and replace the leading part of those URLs, e.g. mine looked something like this: [image: crop-images] (everything up to the date has to be replaced).
  5. Upload your data again and you will be fine.
  6. Export only the media from the old blog (even though Tools / Export / All claims to export everything, media won’t be loaded by the new blog; you need an explicit media export / import).
  7. Import that second file into the new blog.
  8. Have a look at the access log of your old WordPress blog (if you can) and you will see that after some time wordpress.com starts downloading your old media files to their new and right location.
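The search / replace from step 4 can also be scripted. The snippet below is a minimal sketch: the old prefix is a made-up placeholder (use whatever you found in step 3), and the sample line stands in for the content of the exported XML file, which you would read from and write back to disk.

```python
def rewrite_media_urls(export_xml, old_prefix, new_prefix):
    """Replace the leading part of media URLs; return (new_text, replacements)."""
    count = export_xml.count(old_prefix)
    return export_xml.replace(old_prefix, new_prefix), count


# Hypothetical old prefix; the new one is the prefix found in step 1.
OLD_PREFIX = "http://old-blog.example.com/wp-content/uploads/"
NEW_PREFIX = "https://yaridotnet.files.wordpress.com/"

sample = '<img src="http://old-blog.example.com/wp-content/uploads/2016/02/wp-1456610569051.jpeg">'
fixed, replaced = rewrite_media_urls(sample, OLD_PREFIX, NEW_PREFIX)
print(replaced, fixed)
```

Checking the replacement count before uploading is a cheap way to notice that the prefix you searched for was wrong.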