Tuesday, 15 November 2011

Lots of data*.tar files in the author

Recently we had a major problem with the Author eating up disk space (> 110GB).

This was caused by the number of data*.tar files in workspaces/crx.default/ increasing very rapidly, faster than the TAR Optimiser process could keep pace with.

I asked DayCare if we could restore an online backup as a shortcut to removing the additional tar files. They said that would not help.

So, we just ran the TAR Optimiser process as much as we possibly could.

It was running at a rate of 1 tar file per 12 hours normally. But, over the weekend, that rate increased and it did manage to chew up ~200 data*.tar files. Amazing.

NB, when running the TAR Optimiser manually, I was able to reduce the delay from 1.0 milliseconds to 0.25 milliseconds, with a marked improvement in speed.

This delay parameter can be set in the workspace.xml file, so that it is also used by your scheduled overnight TAR Optimiser job, as follows :-

<PersistenceManager class="com.day.crx.persistence.tar.TarPersistenceManager">
<param name="optimizeSleep" value="0.25"/>
</PersistenceManager>

Thursday, 20 October 2011

Fixing a hanging author server

So, we had a problem where the author server would hang.

It would hang for 2-3 hours after startup ("the Loading" would be displayed in the main content area). Then it would work for up to 24 hours. Then the problem would come back and last forever.

The resolution was to apply hotfix 36021 - which had just a few pre-requisites. Ones in bold were already installed but needed re-installing in the correct order.

Do not implement the FineGrainedISMLocking performance change on CRX v2.x!

Installation instructions for the 27 Adobe hotfixes

After each hotfix, slowly and carefully :-

  • Check whether the bundles have stopped; if they have, wait 10 minutes for them all to re-resolve & restart.
  • Look at which jar versions have been installed and see if the new version number is now listed in the bundles list.
  • Check if the Author application displays lists of web pages as you browse in the navigation tree
  • Check if the DAM tab displays lists of assets as you browse in the navigation tree.
  • Check that a web page (eg the homepage) displays ok in authoring mode.
  • Check the log file for errors.

NB, A hotfix number in bold means that it is probably already installed & just needs re-installing.

i. Pre-requisite: Journal & Bundle Cache configuration changes are applied.

ii. Initial Step: Turn off the replication agents

1. HF 28211 (12.03.10)

2. HF 29626 (08.06.10)

3. FP 28358-1.2 (06.07.10)
Add -/tmp as per package description.

4. FP 30015-1.0 (24.07.10)

5. HF 30084 (29.07.10)

6. FP 29944-1.0.1 (03.08.10)
Restart required after this one. The shutdown always hangs.
Check that all the bundles start when the server comes back up. You might need to start these bundles:-
org.apache.sling.api, org.apache.sling.commons.osgi, org.apache.sling.jcr.resource, com.day.cq.workflow.cq-workflow-console

7. FP 30397 (17.08.10)

8. HF 30518 (23.08.10)

9. FP 30553 (27.08.10)
Restart required after this one.
NB this was not required on Test1 but we did it anyway.

10. FP 31852 (14.10.10)

11. FP 30035-2.0 (15.10.10)
A restart was required here (on test2) to pick up the new versions of the bundles.

12. FP 29995-1.1 (18.10.10)
This stops 3 bundles (compat) - this is expected.

13. FP 30532-2 (17.11.10)
This one takes ~10 minutes to recover the bundles.

14. FP 31905 (18.11.10)

15. FP 32186 (29.11.10)

16. HF 32460 (02.12.10)
Restart required after this one. Is workflow-impl at 5.3.26 after 20 minutes (after the 1st restart)? Restart again 30 minutes after the first restart has returned.
There was no need to do the 2nd restart on Test1.
The “workflow-impl” jar needed manually starting after the restart.

17. FP 31902-1.0 (09.12.10)
This one takes ~10 minutes to recover the bundles.

18. FP 30249-3.0 (10.02.11)
A number of tagging & personalisation bundles stop here & then resolve themselves. Also there was a “Zip file closed” error in the logs.

19. FP 31033-2.1 (24.03.11)

20. HF 34460-1.0 (25.03.11)
Get a problem where the main content area in the author is not populated - this gets fixed by 34697 …

21. FP 30815 (05.04.11)

22. FP 34697-4.0 (28.04.11)
Many bundles stop & restart themselves here. Let it settle for 10-20 minutes.

23. FP 34334-2.0 (13.05.11)
Cq-dam-core did not update to its new version number. Restart required.
It needed starting manually after the restart but it had picked up the right version.

24. STOP & TAKE A BACKUP!

25. FP 34901-3.0 (13.05.11)
There is no need to implement the ‘eventadmin.jar’ workaround anymore.
Wcm-core was at 5.3.68 and should go to v5.3.72.
There is no need to restart, since 34901-3 (3 Oct 2011).

26. FP 34071 (19.05.11)
On Perf, had to wait 10 minutes for “the Loading symptom” to disappear.
On test1, 2 bundles stopped and had to wait 5 minutes to resolve themselves.

27. FP 36021
Got a 500 error (TopLevelComponentContextImpl) when viewing web pages.
A restart fixes this.

28. 33200-9.0

29. Restart the server for good measure!

30. Final Step: Turn on the replication agents

Tuesday, 20 September 2011

If I were a CQ systems administrator....

Write a script to monitor the repository folder. How big is it? How many tar files are in it today? What is the difference between yesterday's tar file total & today's? The motivation for this is that sometimes a server's disk usage keeps increasing, and it's because the TAR Optimiser isn't running for long enough.

Grep the logs for the "last stage" of the online backup.

Similarly for the TAR Optimiser completion step.
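As a rough idea of what the monitoring script could look like, here is a minimal Python sketch. The function names, the JSON state file and the directory arguments are all my own assumptions for illustration; only the data*.tar naming comes from the repository itself.

```python
import glob
import json
import os


def directory_size(path):
    """Total size in bytes of all files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def tar_file_count(workspace_dir):
    """Number of data*.tar files directly inside workspace_dir."""
    return len(glob.glob(os.path.join(workspace_dir, "data*.tar")))


def daily_report(repository_dir, workspace_dir, state_file):
    """Compare today's figures with the ones saved by the previous run.

    state_file is a hypothetical JSON file used to remember the last run.
    """
    today = {
        "repository_bytes": directory_size(repository_dir),
        "tar_files": tar_file_count(workspace_dir),
    }
    previous = None
    if os.path.exists(state_file):
        with open(state_file) as handle:
            previous = json.load(handle)
    # Save today's figures before adding the derived difference.
    with open(state_file, "w") as handle:
        json.dump(today, handle)
    if previous is not None:
        today["tar_files_change"] = today["tar_files"] - previous["tar_files"]
    return today
```

Run daily_report() once a day from a scheduler, pointing workspace_dir at workspaces/crx.default/ inside your repository; a steadily positive tar_files_change suggests the TAR Optimiser is not keeping up.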

Thursday, 1 September 2011

To enable logging for JSP pages

Could you also include the information on how we can add the org.apache.sling.commons.log.names values for those which are inside a jsp page. I believe the value needs to be something like apps.myapps if the center.jsp is inside apps/myapps/center.jsp.
However this is not working for me.

Hi, JSPs are compiled into packages below org.apache.jsp with the script path converted to further package parts. So in your example the /apps/myapps/center.jsp is compiled into the package: org.apache.jsp.apps.myapps.

Thus the logger must be set up with this prefixed package.
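To make the mapping concrete, here is a small Python sketch that derives the logger name from a script path. The function name is made up for illustration, and it only handles simple folder names; folder names containing characters that are not valid in a Java identifier are probably mangled by the JSP compiler and are not covered here.

```python
import posixpath


def jsp_logger_name(script_path):
    """Return the package prefix to use as the logger name for a JSP.

    Assumes every folder name in the path is already a valid Java identifier.
    """
    folder = posixpath.dirname(script_path).strip("/")
    parts = [part for part in folder.split("/") if part]
    return ".".join(["org.apache.jsp"] + parts)


print(jsp_logger_name("/apps/myapps/center.jsp"))  # org.apache.jsp.apps.myapps
```

The printed value is what would go into org.apache.sling.commons.log.names for the example in the question.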

Thursday, 25 August 2011

Disk space growing on the CQ author server?

If the disk space used on the CQ5 author server is growing by 1-2GB per day, check the filesystem to see where the growth is.

If you find that the "journal" directory is gigabytes in size, then check out this article : http://dev.day.com/content/kb/home/Crx/Troubleshooting/JournalTooMuchDiskSpace.html

Symptoms

With a default FileJournal configuration in place, depending on the activity on the repository, over time, many Journal log-files will be created. This eventually may cause a disk space issue and performance problems in applications that use CRX.

Cause

The default configuration of the Journal theoretically allows for an unlimited number of rotated log-files.

Non-clustered environment resolution :-

In a non-clustered environment where CRX is running standalone, it is recommended to configure the maximum size of a Journal log-file to 100MB and limit the number of allowed files to 1. This is more than sufficient for such a setup.
<Journal class="com.day.crx.core.journal.FileJournal">
<param name="sharedPath" value="${rep.home}/shared"/>
<param name="maximumSize" value="104857600" />
<param name="maximumFiles" value="1" />
</Journal>
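The maximumSize value is in bytes. A quick Python check of where 104857600 comes from (the helper name is just for illustration):

```python
def megabytes_to_bytes(megabytes):
    """Convert a size in MB (1 MB = 1024 * 1024 bytes) to bytes."""
    return megabytes * 1024 * 1024


print(megabytes_to_bytes(100))  # 104857600
```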

Note: I made this change on an author which was struggling to stay up for a day at a time, and it made a difference. We were also able to shut the author down cleanly after applying this change. It saved 35GB of disk space. However, the journal file is still filling up its 100MB every hour. Need to turn this off completely....

Checking disk usage in CRX

Use this URL to see what is using the most disk space in CRX :-

http://localhost:4502/etc/reports/diskusage.html?path=/content/dam

Friday, 8 July 2011

Adding accessibility 'skip to' links

<div id="accessibility">
<a href="#skiptonavigation" accesskey="n">Skip to navigation</a>
<a href="#skiptocontent" accesskey="c">Skip to content</a>
</div>

<a name="skiptocontent"></a>


#accessibility a
{
    position:absolute;
    left:-10000px;
    top:auto;
    width:1px;
    height:1px;
    overflow:hidden;
}
#accessibility a:focus
{
    position:static;
    width:auto;
    height:auto;
}

from http://webaim.org/techniques/css/invisiblecontent/

Tuesday, 5 July 2011

Looping through the children of a specified node

Here is a JSP to dump out the child nodes of a specified parent node, & the properties of the child nodes.


<%request.setAttribute("silentAuthor", Boolean.TRUE);%><%
%><%@include file="/libs/foundation/global.jsp"%><%
%><%@ page import="java.util.Iterator,
    org.apache.sling.api.resource.ResourceUtil,
    org.apache.commons.lang.StringUtils"%><%

// Plain-text output: one line per child node.
response.setContentType("text/plain");
response.setCharacterEncoding("utf-8");

// The "pathsList" parameter holds one page path (or URL path) per line.
String fullRecipeList = slingRequest.getParameter("pathsList");
if (!StringUtils.isEmpty(fullRecipeList))
{
    String[] recipeList = fullRecipeList.split("\n");

    for (int recipeCount = 0; recipeCount < recipeList.length; recipeCount++)
    {
        String currentRecipePath = recipeList[recipeCount].trim();
        if (StringUtils.isEmpty(currentRecipePath))
        {
            continue;
        }

        // Strip off ".html" and everything after it.
        int positionOfHtml = currentRecipePath.indexOf(".html");
        if (positionOfHtml > 0)
        {
            currentRecipePath = currentRecipePath.substring(0, positionOfHtml);
        }

        // resolve() returns a non-existing resource, not null, for unknown paths.
        Resource resRootPage = slingRequest.getResourceResolver().resolve(currentRecipePath);
        if (resRootPage == null || ResourceUtil.isNonExistingResource(resRootPage))
        {
            continue;
        }

        Iterator<Resource> children = ResourceUtil.listChildren(resRootPage);
        while (children.hasNext())
        {
            Resource resourcePage2 = children.next();

            // Skip the parent's own jcr:content node.
            if (resourcePage2.getPath().indexOf("jcr:content") > -1)
            {
                continue;
            }

            Node n = resourcePage2.adaptTo(Node.class);
            if (n == null)
            {
                continue;
            }
%>
<%= n.getPath() %> <%
            // Print every single-valued property as "name value".
            PropertyIterator resProps = n.getProperties();
            while (resProps.hasNext())
            {
                Property p = resProps.nextProperty();
                if (!p.getDefinition().isMultiple())
                {
%> <%= p.getName() %> <%= p.getString() %> <%
                }
            }
        }
    }
}
%>

Friday, 1 July 2011

Components need dialogs

Weird behaviour. I can't get a component to appear in the design view unless it has a dialog associated with it.

Wednesday, 22 June 2011

Accessing DAM Asset meta information

Here's some cool information from AK:-

To get the metadata in a simple fashion, you can use the standard sling json representation:
http://localhost:4502//jcr:content/metadata.infinity.json

For example:
http://localhost:4502/content/dam/geometrixx/banners/dsc.jpg/jcr:content/metadata.infinity.json
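If you want to pull this JSON from a script, something like the following Python sketch should do. The function names are made up, and it assumes the instance accepts HTTP basic authentication with whatever user and password you pass in; adjust to your own setup.

```python
import base64
import json
import urllib.request


def metadata_url(asset_path, host="http://localhost:4502"):
    """Build the metadata.infinity.json URL for a DAM asset path."""
    return host.rstrip("/") + "/" + asset_path.strip("/") + "/jcr:content/metadata.infinity.json"


def fetch_metadata(asset_path, user, password, host="http://localhost:4502"):
    """Fetch and decode the asset metadata (assumes basic authentication)."""
    request = urllib.request.Request(metadata_url(asset_path, host))
    token = base64.b64encode((user + ":" + password).encode("utf-8")).decode("ascii")
    request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


print(metadata_url("/content/dam/geometrixx/banners/dsc.jpg"))
```

fetch_metadata() returns a plain dictionary, so individual metadata properties can be read by name once you know which ones your assets carry.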

Docs on metadata:
http://dev.day.com/content/docs/en/cq/current/dam/metadata_for_digitalassetmanagement.html


There is also the Sharepoint connector for JCR as a separate product,
which might be useful if there is a Sharepoint involved:
http://dev.day.com/content/docs/en/crx/connectors/sharepoint/current.html

If you need further custom code on the CQ side to provide e.g. new
servlets / JSPs that expose dam assets in a custom way, here are some more
links:

General docs:
http://dev.day.com/content/docs/en/cq/current.html#Working%20with%20Digital%20Assets%20in%20CQ%20DAM


CQ DAM API (Java):
http://dev.day.com/content/docs/en/cq/current/javadoc/com/day/cq/dam/api/package-summary.html


Some more (developer) info on extending DAM:
http://dev.day.com/content/docs/en/cq/current/dam/customizing_and_extendingcq5dam.html


Regards,
Alex

Wednesday, 25 May 2011

w3c validation

I shouldn't need reminding about this.

Today, a tester showed me the validation report on a site I've been working on. Ouch!

Check your html here: http://validator.w3.org/

Add this as unit testing in the dev process.

Wednesday, 18 May 2011

Performance list

When approaching a new website launch, make sure you run through the Day performance & security checks.

http://dev.day.com/content/docs/en/cq/current/deploying/performance.html

CRX hotfixes

When approaching any new project, always ensure you have the CRX hotfixes installed. CRX v2.1.0.10 at least (although, there is 2.1.0.13 available).

NB, CRX hotfixes are cumulative - so you can just install the latest one.

Wednesday, 27 April 2011

XSS Protection Service

The XSS Protection service in CQ v5.3 is great. However, I am having problems with it clashing with the xerces XML parser. From looking at the error in forum posts, it probably should not contain the xml apis package inside its bundle.

Think I'll need to raise a Day Care ticket for this one.

Wednesday, 20 April 2011

Browser Bookmarks

I have created a very useful way to store my CQ environment bookmarks so that any system is easily accessible.

Create a folder in the Bookmarks Toolbar for each Environment. E.g.

Dev/ Sit/ UAT/ Perf/ Live/ Educ/

Then, in each environment, create a set of sub folders for CRX, Felix, etc...

Live/
CRX/ Felix/ Publish HTML/ Logs/ System Status/

Then put bookmarks to all of your CRX instances, for that environment, under the CRX folder, etc...

The beauty of this is that in browsers (Firefox 3.5 at least) you can quickly navigate to the "Live/CRX" folder and select "open all in tabs". This gives fast access to Content Explorer when you have to go and look in multiple instances.

Tuesday, 19 April 2011

HTML page construction standards for performance

Started off with a page in my local dev environment. Jmeter script to fire 10 threads & make a total of 50 requests. All times in milliseconds.

First run:-
Ave: 765 90%:859 Max: 880

Combined 10 javascript files in to 1.
Second run:-
Ave: 653 90%:729 Max: 877

Minified the combined javascript file.
Third run:-
Ave: 610 90%:685 Max: 697

Moved the Omniture stuff to the bottom of the page (just before the usual "scripts.jsp" is included).
Fourth run:-
Ave: 554 90%:564 Max: 576

3 Development Standards to be adopted
Combine most of your javascripts in to 1 file.
Minify that javascript file.
Leave Omniture until last! (Preferably with an Ajax request onload).
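To put the four runs in perspective, here is a quick Python calculation of how much each change saved on the average response time. The labels are my own shorthand; the numbers are the averages from the runs above.

```python
# Average response times in milliseconds, in the order the changes were made.
runs = [
    ("first run", 765),
    ("combined javascript", 653),
    ("minified javascript", 610),
    ("Omniture at the bottom", 554),
]


def percent_saved(before, after):
    """Percentage reduction going from before to after."""
    return (before - after) / before * 100


for (_, before), (label, after) in zip(runs, runs[1:]):
    print("%s: %.1f%% faster than the previous run" % (label, percent_saved(before, after)))

print("overall: %.1f%% faster" % percent_saved(runs[0][1], runs[-1][1]))
```

Overall the average drops from 765 ms to 554 ms, which is a bit over a quarter off the original time.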

Wednesday, 13 April 2011

Improving performance with browser caching

On requests, check that you are using browser caching.

Last-Modified should be set.
Expires should be set (seems to be dynamically set if max-age is set on cache-control).
On cache-control - watch out for must-revalidate - this seems to bypass the cache.

This is not good:-
Cache-Control max-age=0

This is OK:-
Cache-Control max-age=1800, public, must-revalidate

NB, public is to allow caching on SSL links.

This is BEST:-
Cache-Control max-age=18000, public
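The three cases above can be turned into a quick check. This is only a Python sketch of my own rule of thumb from this post, not of the HTTP caching rules themselves, and the function names are made up.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into a dict of directive -> argument (or True)."""
    directives = {}
    for part in value.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, argument = part.partition("=")
        directives[name.strip()] = argument.strip() or True
    return directives


def rate_cache_control(value):
    """Rate a Cache-Control value as 'not good', 'ok' or 'best' (rule of thumb only)."""
    directives = parse_cache_control(value)
    raw_max_age = directives.get("max-age")
    # A missing or non-numeric max-age is treated the same as max-age=0.
    max_age = int(raw_max_age) if isinstance(raw_max_age, str) and raw_max_age.isdigit() else 0
    if max_age <= 0:
        return "not good"
    if "must-revalidate" in directives:
        return "ok"
    return "best"


for header in ("max-age=0", "max-age=1800, public, must-revalidate", "max-age=18000, public"):
    print(header, "->", rate_cache_control(header))
```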

Are the flash files cached?

Tuesday, 12 April 2011

Online backups

The online backup seems to occur in 3 or 4 stages (not clear from the logs).

In the last stage, for a large repository (40GB), threads seem to hang if anyone tries to log in. However, the online backup will eventually finish and the hung threads are released to do work. No need to restart the server.

The start of the online backup looks like this (in system out log) :-
SystemOut O dd.mm.yyyy hh:mm:ss *INFO * Backup: Read size (nnnnnnnnn bytes) after nnnnnn ms (Backup.java, line 169)

The end of the online backup looks like this :-
SystemOut O dd.mm.yyyy hh:mm:ss *INFO * Backup: Copied last stage after nnnnnnnnnn ms (Backup.java, line 198)
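Here is a minimal Python sketch for picking these two lines out of a log. The regular expressions are based only on the two message formats shown above, and the sample lines (dates, sizes and timings) are invented for the demo.

```python
import re

# Patterns built from the two log messages shown above.
BACKUP_START = re.compile(r"Backup: Read size \((\d+) bytes\) after (\d+) ms")
BACKUP_END = re.compile(r"Backup: Copied last stage after (\d+) ms")


def backup_events(lines):
    """Return a list of ('start' | 'end', milliseconds) for the backup log lines found."""
    events = []
    for line in lines:
        match = BACKUP_START.search(line)
        if match:
            events.append(("start", int(match.group(2))))
            continue
        match = BACKUP_END.search(line)
        if match:
            events.append(("end", int(match.group(1))))
    return events


# Made-up sample lines in the format shown above.
sample = [
    "SystemOut O 12.04.2011 01:00:05 *INFO * Backup: Read size (123456789 bytes) after 4321 ms (Backup.java, line 169)",
    "SystemOut O 12.04.2011 01:30:00 *INFO * something unrelated",
    "SystemOut O 12.04.2011 03:10:42 *INFO * Backup: Copied last stage after 987654 ms (Backup.java, line 198)",
]
print(backup_events(sample))
```

A "start" with no matching "end" is the situation to watch for, since that is when logins appear to hang.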

(working on externalising the Data Store to reduce the online backup time by 70%).

_rep_policy.xml

If you have modified a _rep_policy.xml file and put it into a package, then when you install it, make sure to check the advanced settings / Access Control Handling.

If you know you have all the settings in the rep:Policy, then choose Overwrite.

Monday, 11 April 2011

Excluding folders from causing a dispatcher flush

So, you have reverse replication all set up and the usergenerated content is being replicated to all the other publish instances in your CQ estate.

However, upon closer examination of the logs, especially at peak time, you notice that the dispatcher cache is being invalidated too much. And the culprit is /content/usergenerated - the reverse replicated content!

To exclude this, create a user in the system and replicate it to the publish nodes, then modify the rep:policy XML for the /content/usergenerated node to deny this user jcr:all.

Finally, associate this user with the flush agent on the publish instances (the first tab of the user details dialog).

Thursday, 7 April 2011

Another useful Adobe hotfix

Adobe CQ hotfix for Replication Stabilization = 34595.

Another must have hotfix!

Adobe Hotfixes

The following hotfixes are a must have for making ratings scalable (if you use CQ ratings).

cq-5.3.0-hotfix-28211.zip
cq-5.3.0-featurepack-28358-1.2.zip
cq-5.3.0-featurepack-30397-2.0.zip
cq-5.3.0-hotfix-30518-1.0.zip
cq-5.3.0-featurepack-31033-2.1.zip
cq-5.3.0-hotfix-34541-1.1.zip

Wednesday, 6 April 2011

Setting up dispatcher

Here are some pointers to setting up a Day CQ dispatcher

On Windows,
In dispatcher.any, set the docroot to something simple like
/docroot "c:/mywork/htdocs/cache"
Change the permissions on this folder so that apache can write to it.

In httpd.conf
DocumentRoot "C:/mywork/htdocs"

If there are rewrite rules required, then remember to turn the RewriteEngine on :-

RewriteEngine on
RewriteRule ^/?$ /content/projectx/en/home.html [QSA,PT,L]
RewriteRule ^(?!/content/)(.*)\.html$ /content/projectx/en$0 [QSA,PT,L]
RewriteRule ^(/css/).*$ /apps/projectx/docroot$0 [QSA,PT,L]
RewriteRule ^(/js/).*$ /apps/projectx/docroot$0 [QSA,PT,L]
RewriteRule ^(/images/).*$ /apps/projectx/docroot$0 [QSA,PT,L]
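To sanity-check what these rules do to a request path without restarting Apache, here is a rough Python simulation. It is only an approximation: it uses Python's regex engine rather than Apache's, ignores the QSA and PT flags, treats L as "first matching rule wins", and escapes the dot before html. The rewrite function is my own helper, not part of any Apache tooling.

```python
import re

# (pattern, substitution) pairs copied from the RewriteRule lines above.
RULES = [
    (r"^/?$", "/content/projectx/en/home.html"),
    (r"^(?!/content/)(.*)\.html$", "/content/projectx/en$0"),
    (r"^(/css/).*$", "/apps/projectx/docroot$0"),
    (r"^(/js/).*$", "/apps/projectx/docroot$0"),
    (r"^(/images/).*$", "/apps/projectx/docroot$0"),
]


def rewrite(path):
    """Apply the first matching rule; $0 stands for the whole matched text."""
    for pattern, substitution in RULES:
        match = re.search(pattern, path)
        if match:
            return substitution.replace("$0", match.group(0))
    return path


for path in ("/", "/about.html", "/css/projectx.css", "/content/projectx/en/about.html"):
    print(path, "->", rewrite(path))
```

Paths already under /content/ are left alone by the second rule because of the negative lookahead.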

If you see the following error in the dispatcher.log :-
[Thu Apr 07 06:54:02 2011] [D] [1780(1444)] Filter rejects GET /css/projectx.css HTTP/1.1

Then the "/filter /glob" section is blocking the retrieval of the page from the publish server.