{"id":3567,"date":"2020-10-06T16:04:52","date_gmt":"2020-10-06T16:04:52","guid":{"rendered":"https:\/\/nenadnoveljic.com\/blog\/?p=3567"},"modified":"2020-10-06T16:06:43","modified_gmt":"2020-10-06T16:06:43","slug":"parallel-recovery-push-change","status":"publish","type":"post","link":"https:\/\/nenadnoveljic.com\/blog\/parallel-recovery-push-change\/","title":{"rendered":"parallel recovery push change"},"content":{"rendered":"<h1>VKTM<\/h1>\n<p>A summary of my <a href=\"https:\/\/nenadnoveljic.com\/blog\/transaction-start_time-internals\/\">previous blog post<\/a>: <i>VKTM<\/i> background process hung in <i>nanosleep<\/i> system call due to an OS bug, and stopped increasing the SGA variable <i>ksudbrmseccnt_<\/i>. Consequently, <i>v$transaction.start_time<\/i> is wrong, and AWR snapshots aren&#8217;t created anymore.<\/p>\n<h1>Parallel recovery<\/h1>\n<p>Parallel recovery on the standby database is another process which regularly calls <i>nanosleep<\/i>.<\/p>\n<p>I observed a parallel recovery process waiting on the <i>parallel recovery push change<\/i> wait event 100% of its time. This wait event isn&#8217;t documented, so I inspected it with DTrace.<\/p>\n<p>I&#8217;ve got the event code from the data dictionary:<\/p>\n<pre><code>select event# from v$event_name where name = 'parallel recovery push change' ;\n\n    EVENT#\n----------\n      1021<\/code><\/pre>\n<p>and embedded it into the following DTrace script:<\/p>\n<pre><code>#!\/usr\/sbin\/dtrace\n\n#pragma D option quiet\n\npid$target::kskthbwt:entry\n\/ arg1 == 1021 \/\n{\n  self-&gt;event = arg1 ;\n  printf (\"BEGIN wait event parallel recovery push change\\n\");\n  ustack();\n}\n\npid$target::kslwtectx:entry\n\/ self-&gt;event == 1021 \/\n{\n  printf (\"END wait event parallel recovery push change\\n\");\n  self-&gt;event = 0 ;\n}\n\npid$target::nanosleep:entry\n\/ self-&gt;event == 1021 \/\n{\n  printf(\"nanosleep \\n\");\n}<\/code><\/pre>\n<p>A short explanation of the script above: <i>kskthbwt<\/i> and <i>kslwtectx<\/i> are part of the Oracle wait interface API. They mark the beginning and the end of the wait event, respectively. <i>kskthbwt<\/i> receives the wait event code through the second argument (<i>arg1<\/i>).<\/p>\n<p>The output:<\/p>\n<pre><code>BEGIN wait event parallel recovery push change\n\noracle`kskthbwt\noracle`kslwtbctx+0x30e\na.out`kcrm_logmgr_idle_action+0x257\na.out`kcrm_wait_for_reply+0x55\na.out`krr_do_media_recovery+0xd1f5\na.out`krddmr+0xc7b\na.out`krd_do_mrp+0x855\na.out`krd_logmerger_driver+0x23a5\na.out`krp_slave_main+0xbc0\na.out`ksvrdp_int+0x94b\na.out`opirip+0x263\noracle`opidrv+0x24b\noracle`sou2o+0x97\noracle`opimai_real+0x9f\noracle`ssthrdmain+0x266\noracle`main+0xa7\na.out`0xb4c237b\n\nnanosleep\n\nEND wait event parallel recovery push change<\/code><\/pre>\n<p><i>nanosleep<\/i> is part of the code instrumented by this wait event. The functions on the stack <i>kcrm_logmgr_idle_action<\/i> and <i>kcrm_wait_for_reply<\/i>, and a <i>nanosleep<\/i> call might indicate that this is an idle wait event. Inspite of that, ASH records it.<\/p>\n<h1>Debugging<\/h1>\n<p>You can simulate a hanging <i>nanosleep<\/i> call with the debugger.<\/p>\n<p>First, set the conditional breakpoint on <i>kskthbwt<\/i>:<\/p>\n<pre><code>b kskthbwt if $rsi == 1021<\/code><\/pre>\n<p>Second, attach the parallel recovery process and continue:<\/p>\n<pre><code>ps -ef | grep pr00\n  oracle 27582 26345   0 13:38:25 ?          15:45 ora_pr00_DB<\/code><\/pre>\n<pre><code>attach 27582\nc<\/code><\/pre>\n<p>Third, after hitting the conditional breakpoint, set a breakpoint on <i>nanosleep<\/i> and continue:<\/p>\n<pre><code>Thread 42 hit Breakpoint 1, 0x000000000a09c140 in kskthbwt ()\nb nanosleep\nc\n\nThread 42 hit Breakpoint 2, 0x00007fff63e5e3d4 in nanosleep () from \/lib\/64\/libc.so.1<\/code><\/pre>\n<p>While waiting on the breakpoint, check the active session history:<\/p>\n<pre><code>select event,count(*) from v$active_session_history \n  where \n    program like '%PR00%' and sample_time &gt; sysdate - 1 \/ 24 \/ 60 \n  group by event order by count(*) desc ;\n\nEVENT                                                              COUNT(*)\n---------------------------------------------------------------- ----------\nparallel recovery push change                                            60<\/code><\/pre>\n<p>The parallel recovery process has spent all of its time waiting on <i>parallel recovery push change<\/i>.<\/p>\n<p>It returns to the normal operational mode after detaching the process:<\/p>\n<pre><code>detach<\/code><\/pre>\n<pre><code>select event,count(*) from v$active_session_history \n  where program like '%PR00%' and sample_time &gt; sysdate - 1 \/ 24 \/ 60 \n  group by event order by count(*) desc ;\n\nEVENT                                                              COUNT(*)\n---------------------------------------------------------------- ----------\nparallel recovery push change                                            39\n                                                                          5\ncontrol file sequential read                                              3\nlog file sequential read                                                  2\ncontrol file parallel write                                               1<\/code><\/pre>\n<p>Some <i>parallel recovery push change<\/i> waits are normal. But completely lacking any other activity might be an indication that the process is stuck in <i>nanosleep<\/i>.<\/p>\n<h1>Summary<\/h1>\n<p>In summary, a parallel recovery process would grind to a halt when a <i>nanosleep<\/i> system call hung. Parallel recovery process waiting on <i>parallel recovery push change<\/i> 100% of its time is an indication of a such problem.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A stuck <i>nanosleep<\/i> call impairs a parallel recovery process. The affected parallel recovery process spends 100% of its time in the undocumented <i>parallel recovery push change<\/i> wait event. <a href=\"https:\/\/nenadnoveljic.com\/blog\/parallel-recovery-push-change\/\" class=\"more-link\">Continue Reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[24,27,5],"tags":[],"class_list":["post-3567","post","type-post","status-publish","format-standard","hentry","category-dtrace","category-gdb","category-oracle"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>parallel recovery push change - All-round Database Topics<\/title>\n<meta name=\"description\" content=\"A stuck nanosleep call impairs a parallel recovery process. The affected parallel recovery process spends 100% of its time in the undocumented parallel recovery push change wait event.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/nenadnoveljic.com\/blog\/parallel-recovery-push-change\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"parallel recovery push change - All-round Database Topics\" \/>\n<meta property=\"og:description\" content=\"A stuck nanosleep call impairs a parallel recovery process. The affected parallel recovery process spends 100% of its time in the undocumented parallel recovery push change wait event.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/nenadnoveljic.com\/blog\/parallel-recovery-push-change\/\" \/>\n<meta property=\"og:site_name\" content=\"All-round Database Topics\" \/>\n<meta property=\"article:published_time\" content=\"2020-10-06T16:04:52+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2020-10-06T16:06:43+00:00\" \/>\n<meta name=\"author\" content=\"Nenad Noveljic\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@NenadNoveljic\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Nenad Noveljic\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/nenadnoveljic.com\\\/blog\\\/parallel-recovery-push-change\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/nenadnoveljic.com\\\/blog\\\/parallel-recovery-push-change\\\/\"},\"author\":{\"name\":\"Nenad Noveljic\",\"@id\":\"https:\\\/\\\/nenadnoveljic.com\\\/blog\\\/#\\\/schema\\\/person\\\/51458d9dd86dbbdd19f5add451d44efa\"},\"headline\":\"parallel recovery push change\",\"datePublished\":\"2020-10-06T16:04:52+00:00\",\"dateModified\":\"2020-10-06T16:06:43+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/nenadnoveljic.com\\\/blog\\\/parallel-recovery-push-change\\\/\"},\"wordCount\":334,\"commentCount\":0,\"articleSection\":[\"DTrace\",\"gdb\",\"Oracle\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/nenadnoveljic.com\\\/blog\\\/parallel-recovery-push-change\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/nenadnoveljic.com\\\/blog\\\/parallel-recovery-push-change\\\/\",\"url\":\"https:\\\/\\\/nenadnoveljic.com\\\/blog\\\/parallel-recovery-push-change\\\/\",\"name\":\"parallel recovery push change - All-round Database Topics\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/nenadnoveljic.com\\\/blog\\\/#website\"},\"datePublished\":\"2020-10-06T16:04:52+00:00\",\"dateModified\":\"2020-10-06T16:06:43+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/nenadnoveljic.com\\\/blog\\\/#\\\/schema\\\/person\\\/51458d9dd86dbbdd19f5add451d44efa\"},\"description\":\"A stuck nanosleep call impairs a parallel recovery process. The affected parallel recovery process spends 100% of its time in the undocumented parallel recovery push change wait event.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/nenadnoveljic.com\\\/blog\\\/parallel-recovery-push-change\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/nenadnoveljic.com\\\/blog\\\/parallel-recovery-push-change\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/nenadnoveljic.com\\\/blog\\\/parallel-recovery-push-change\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/nenadnoveljic.com\\\/blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"parallel recovery push change\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/nenadnoveljic.com\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/nenadnoveljic.com\\\/blog\\\/\",\"name\":\"All-round Database Topics\",\"description\":\"Nenad Noveljic\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/nenadnoveljic.com\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/nenadnoveljic.com\\\/blog\\\/#\\\/schema\\\/person\\\/51458d9dd86dbbdd19f5add451d44efa\",\"name\":\"Nenad Noveljic\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a97b796613ea48ec8a7b79c8ffe1c685dcffc920c68121f6238d5caab5070670?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a97b796613ea48ec8a7b79c8ffe1c685dcffc920c68121f6238d5caab5070670?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a97b796613ea48ec8a7b79c8ffe1c685dcffc920c68121f6238d5caab5070670?s=96&d=mm&r=g\",\"caption\":\"Nenad Noveljic\"},\"sameAs\":[\"nenad-noveljic-9b746a6\",\"https:\\\/\\\/x.com\\\/NenadNoveljic\"],\"url\":\"https:\\\/\\\/nenadnoveljic.com\\\/blog\\\/author\\\/nenad\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"parallel recovery push change - All-round Database Topics","description":"A stuck nanosleep call impairs a parallel recovery process. The affected parallel recovery process spends 100% of its time in the undocumented parallel recovery push change wait event.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/nenadnoveljic.com\/blog\/parallel-recovery-push-change\/","og_locale":"en_US","og_type":"article","og_title":"parallel recovery push change - All-round Database Topics","og_description":"A stuck nanosleep call impairs a parallel recovery process. The affected parallel recovery process spends 100% of its time in the undocumented parallel recovery push change wait event.","og_url":"https:\/\/nenadnoveljic.com\/blog\/parallel-recovery-push-change\/","og_site_name":"All-round Database Topics","article_published_time":"2020-10-06T16:04:52+00:00","article_modified_time":"2020-10-06T16:06:43+00:00","author":"Nenad Noveljic","twitter_card":"summary_large_image","twitter_creator":"@NenadNoveljic","twitter_misc":{"Written by":"Nenad Noveljic","Est. reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/nenadnoveljic.com\/blog\/parallel-recovery-push-change\/#article","isPartOf":{"@id":"https:\/\/nenadnoveljic.com\/blog\/parallel-recovery-push-change\/"},"author":{"name":"Nenad Noveljic","@id":"https:\/\/nenadnoveljic.com\/blog\/#\/schema\/person\/51458d9dd86dbbdd19f5add451d44efa"},"headline":"parallel recovery push change","datePublished":"2020-10-06T16:04:52+00:00","dateModified":"2020-10-06T16:06:43+00:00","mainEntityOfPage":{"@id":"https:\/\/nenadnoveljic.com\/blog\/parallel-recovery-push-change\/"},"wordCount":334,"commentCount":0,"articleSection":["DTrace","gdb","Oracle"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/nenadnoveljic.com\/blog\/parallel-recovery-push-change\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/nenadnoveljic.com\/blog\/parallel-recovery-push-change\/","url":"https:\/\/nenadnoveljic.com\/blog\/parallel-recovery-push-change\/","name":"parallel recovery push change - All-round Database Topics","isPartOf":{"@id":"https:\/\/nenadnoveljic.com\/blog\/#website"},"datePublished":"2020-10-06T16:04:52+00:00","dateModified":"2020-10-06T16:06:43+00:00","author":{"@id":"https:\/\/nenadnoveljic.com\/blog\/#\/schema\/person\/51458d9dd86dbbdd19f5add451d44efa"},"description":"A stuck nanosleep call impairs a parallel recovery process. The affected parallel recovery process spends 100% of its time in the undocumented parallel recovery push change wait event.","breadcrumb":{"@id":"https:\/\/nenadnoveljic.com\/blog\/parallel-recovery-push-change\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/nenadnoveljic.com\/blog\/parallel-recovery-push-change\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/nenadnoveljic.com\/blog\/parallel-recovery-push-change\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/nenadnoveljic.com\/blog\/"},{"@type":"ListItem","position":2,"name":"parallel recovery push change"}]},{"@type":"WebSite","@id":"https:\/\/nenadnoveljic.com\/blog\/#website","url":"https:\/\/nenadnoveljic.com\/blog\/","name":"All-round Database Topics","description":"Nenad Noveljic","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/nenadnoveljic.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/nenadnoveljic.com\/blog\/#\/schema\/person\/51458d9dd86dbbdd19f5add451d44efa","name":"Nenad Noveljic","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/a97b796613ea48ec8a7b79c8ffe1c685dcffc920c68121f6238d5caab5070670?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/a97b796613ea48ec8a7b79c8ffe1c685dcffc920c68121f6238d5caab5070670?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/a97b796613ea48ec8a7b79c8ffe1c685dcffc920c68121f6238d5caab5070670?s=96&d=mm&r=g","caption":"Nenad Noveljic"},"sameAs":["nenad-noveljic-9b746a6","https:\/\/x.com\/NenadNoveljic"],"url":"https:\/\/nenadnoveljic.com\/blog\/author\/nenad\/"}]}},"_links":{"self":[{"href":"https:\/\/nenadnoveljic.com\/blog\/wp-json\/wp\/v2\/posts\/3567","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nenadnoveljic.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nenadnoveljic.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nenadnoveljic.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nenadnoveljic.com\/blog\/wp-json\/wp\/v2\/comments?post=3567"}],"version-history":[{"count":1,"href":"https:\/\/nenadnoveljic.com\/blog\/wp-json\/wp\/v2\/posts\/3567\/revisions"}],"predecessor-version":[{"id":3576,"href":"https:\/\/nenadnoveljic.com\/blog\/wp-json\/wp\/v2\/posts\/3567\/revisions\/3576"}],"wp:attachment":[{"href":"https:\/\/nenadnoveljic.com\/blog\/wp-json\/wp\/v2\/media?parent=3567"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nenadnoveljic.com\/blog\/wp-json\/wp\/v2\/categories?post=3567"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nenadnoveljic.com\/blog\/wp-json\/wp\/v2\/tags?post=3567"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}